I build end-to-end data pipelines, machine learning systems, and NLP applications that turn raw data into business decisions. Based in New York City.
I work across the full data stack, from raw data ingestion to ML model deployment.
PySpark, Apache Airflow, Databricks, Snowflake, ETL/ELT pipelines, Parquet, data modeling, PostgreSQL, Redshift
scikit-learn, XGBoost, Random Forest, feature engineering, model evaluation, hyperparameter tuning
Hugging Face Transformers, BART, FinBERT, sentiment analysis, text summarization
AWS (S3, EC2, Redshift), Azure, Docker, Kubernetes
A/B testing, hypothesis testing, pandas, NumPy, SciPy, Power BI, Tableau, Matplotlib
FastAPI, REST APIs, SQLAlchemy, PostgreSQL, Python, SQL