Data Engineering

Data processing, machine learning, and AI-powered workflows.

Category: data-ai
Subcategory: Data Engineering
199 matches
pytorch-patterns.md
184.2K

from "affaan-m/everything-claude-code"

PyTorch deep learning patterns and best practices for building robust, efficient, and reproducible training pipelines, model architectures, and data loading.

2026-05-16
clickhouse-io.md
184.2K

from "affaan-m/everything-claude-code"

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

2026-05-16
snowflake-semanticview.md
33.2K

from "github/awesome-copilot"

Create, alter, and validate Snowflake semantic views using the Snowflake CLI (snow). Use when asked to build or troubleshoot semantic views/semantic layer definitions with CREATE/ALTER SEMANTIC...

2026-05-17
shuffle-json-data.md
33.2K

from "github/awesome-copilot"

Shuffle repetitive JSON objects safely by validating schema consistency before randomising entries.

2026-05-17
polars.md
23.4K

from "K-Dense-AI/scientific-agent-skills"

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for...

2026-05-17
pytorch-lightning.md
23.4K

from "K-Dense-AI/scientific-agent-skills"

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard...

2026-05-17
flowio.md
23.4K

from "K-Dense-AI/scientific-agent-skills"

Parse FCS (Flow Cytometry Standard) files, v2.0-3.1. Extract events as NumPy arrays, read metadata/channels, and convert to CSV/DataFrame for flow cytometry data preprocessing.

2026-05-17
chief-data-officer-advisor.md
15K

from "alirezarezvani/claude-skills"

Chief Data Officer advisory for startups: AI training data rights and consent provenance, data product strategy (warehouse vs lakehouse vs mesh, build-vs-buy), B2B customer-data-as-asset...

2026-05-16
data-quality-auditor.md
15K

from "alirezarezvani/claude-skills"

Audit datasets for completeness, consistency, accuracy, and validity. Profile data distributions, detect anomalies and outliers, surface structural issues, and produce an actionable...

2026-05-16
snowflake-development.md
15K

from "alirezarezvani/claude-skills"

Use when writing Snowflake SQL, building data pipelines with Dynamic Tables or Streams/Tasks, using Cortex AI functions, creating Cortex Agents, writing Snowpark Python, configuring dbt for...

2026-05-16
senior-data-engineer.md
15K

from "alirezarezvani/claude-skills"

Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes...

2026-05-16
gcp-cloud-architect.md
15K

from "alirezarezvani/claude-skills"

Design GCP architectures for startups and enterprises. Use when asked to design Google Cloud infrastructure, deploy to GKE or Cloud Run, configure BigQuery pipelines, optimize GCP costs, or...

2026-05-16
engineering-skills.md
15K

from "alirezarezvani/claude-skills"

23 engineering agent skills and plugins for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw, and 6 more tools. Architecture, frontend, backend, QA, DevOps, security, AI/ML, data...

2026-05-16
azure-cloud-architect.md
15K

from "alirezarezvani/claude-skills"

Design Azure architectures for startups and enterprises. Use when asked to design Azure infrastructure, create Bicep/ARM templates, optimize Azure costs, set up Azure DevOps pipelines, or...

2026-05-16
data-quality-auditor.md
15K

from "alirezarezvani/claude-skills"

../../../engineering/data-quality-auditor/skills/data-quality-auditor/SKILL.md

2026-05-16
brainstorm-okrs.md
11.3K

from "phuryn/pm-skills"

Brainstorm team-level OKRs aligned with company objectives — qualitative objectives with measurable key results. Use when setting quarterly OKRs, aligning team goals with company strategy,...

2026-05-17
hf-cli.md
10.5K

from "huggingface/skills"

Hugging Face Hub CLI (hf) for downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, jobs, and more on the Hugging Face Hub. Use when: handling authentication;...

2026-05-17
ray-data.md
8.5K

from "Orchestra-Research/AI-Research-SKILLs"

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine...

2026-05-17
ml-training-recipes.md
8.5K

from "Orchestra-Research/AI-Research-SKILLs"

Battle-tested PyTorch training recipes for all domains: LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics,...

2026-05-17
implementing-cloud-dlp-for-data-protection.md
6.3K

from "mukul975/Anthropic-Cybersecurity-Skills"

Implementing Cloud Data Loss Prevention (DLP) using Amazon Macie, Azure Information Protection, and Google Cloud

2026-05-16
workflows.md
3.1K

from "HughYau/qiushi-skill"

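Trigger: invoke when the task in front of you clearly requires several "thought weapons" working together; common signals include starting a new project from scratch, tackling a complex and stubborn problem, and iteratively optimizing an existing solution. This skill provides standardized cross-skill workflow combinations, answering the question of which skill to use first and how to chain them.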

2026-05-17
spark-prairie-fire.md
3.1K

from "HughYau/qiushi-skill"

Trigger: invoke when you are starting from scratch with very few resources and need to first find the smallest viable entry point and establish a stable base; common signals include bootstrap, MVP, pilot, first foothold, and a small team getting started.

2026-05-17
data-analytics.md
2.4K

from "markdown-viewer/skills"

Create data pipeline and analytics architecture diagrams using PlantUML syntax with database/analytics stencil icons. Best for ETL pipelines, data lakes, real-time streaming, data...

2026-05-16
data-quality-frameworks.md
2.1K

from "foryourhealth111-pixel/Vibe-Skills"

Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data...

2026-05-17