Data Engineer · Verition Fund · NYC

Building reliable data systems for high-stakes decisions.

I'm Ahsan, a Data Engineer at Verition Fund Management, a multi-strategy hedge fund, where I build data pipelines, platform infrastructure, and production-grade data systems.

Previously: Accenture, NYC Department of Buildings, Fujitsu, WebMD, and ASML.

I focus on finance-grade data platforms built for reliability, data governance, and trust.

Portrait of Ahsan Fayyaz

Impact

What I deliver

95%

Deduplication accuracy in AI-assisted extraction

80%

Reduction in manual data processing

50%

Faster processing in modernized data quality workflows

40%

Reduction in data quality validation time

How I support data teams

  • Build reliable data ingestion pipelines with validation and retry controls.
  • Improve trust in critical datasets through quality checks and observability patterns.
  • Deliver cleaner production datasets faster for research and decision workflows.

Experience

Professional experience

Accenture

Jul 2022 - Mar 2026 · New York, NY

Senior Data Engineer

Jun 2025 - Mar 2026

  • Built a scalable web-scraping platform using Python, Playwright, ScraperAPI, and BeautifulSoup to ingest structured and unstructured alt-data from multiple sources.
  • Developed AI-assisted extraction pipelines (Gemini + fuzzy matching), achieving 95% deduplication accuracy and reducing manual processing by 80%.
  • Designed cloud-hosted ETL workflows on GCP Compute Engine with cron scheduling, REST integrations, retries, monitoring, logging, and automated historical storage.
  • Implemented validation and schema enforcement to maintain accuracy, freshness, and consistency across thousands of daily records.
  • Architected centralized NocoDB master and active tables with lifecycle automation, entitlement controls, and real-time synchronization.

Data Engineer

Jul 2022 - Jun 2025

  • Led design and deployment of a telecom data quality framework using AWS Deequ, Databricks, Spark, Scala, Python, and SQL.
  • Automated 30+ data quality checks, reducing validation time by 40% and improving processing speed by 50%.
  • Developed scalable ETL systems across AWS and Databricks for reliable high-volume production pipelines.

NYC Department of Buildings

Feb 2022 - Jun 2022 · New York, NY

Machine Learning Engineer

  • Designed and developed supervised ML models for the Analytics and Data Science Unit to predict high-risk buildings from historical construction injury data.
  • Engineered multi-year public datasets and built robust train/validate/test workflows with upsampling and hyperparameter tuning.
  • Evaluated Gradient Boosting, Logistic Regression, Neural Networks, K-Nearest Neighbors, and SVM to identify the most reliable model for inspection prioritization.
  • Implemented model analysis and visualization pipelines in Jupyter using NumPy, pandas, matplotlib, scikit-learn, and GeoPandas.
  • Improved risk-based inspection planning and contributed to a 35% reduction in construction incidents.

Fujitsu Network Communications

Jun 2021 - Aug 2021 · Richardson, TX

Software Engineering Intern

  • Designed and developed a cloud-based web application for the Software Business Unit (SWBU) to generate XML, JSON, and XLSX outputs for Virtuora Planning and Design workflows.
  • Built a user-friendly interface to create and download multi-format files in a single streamlined workflow.
  • Eliminated recurring format errors and reduced average manual build time from 25 hours to 30 minutes.
  • Delivered a project that was adopted by a major telecom client and generated approximately $100K in revenue impact.
  • Tech stack: HTML, CSS, JavaScript, Java 8, Oracle DB, jQuery, jQuery UI, AJAX, and Bootstrap.

WebMD

Feb 2020 - May 2021 · New York, NY

Software Engineering Intern

  • Developed and contributed to multiple full-stack applications for WebMD's Consumer Runtime platform.
  • Designed and implemented a Page Performance Testing Dashboard to streamline test creation, tracking, and stakeholder approvals before production deployment.
  • Automated route creation and route validation workflows to remove manual QA steps and improve developer throughput.
  • Implemented regex-based URL validation and PostgreSQL-backed search functionality to improve routing reliability.
  • Designed user interfaces using MVC design principles for maintainable internal tooling.

ASML

Jun 2019 - Aug 2019 · Wilton, CT

Software Engineering Intern

  • Delivered two automation tools for the Reticle Stage SDEV team, including a Linux/Python app for compatibility agreement generation.
  • Reduced compatibility agreement generation from 8 hours to 10 seconds, eliminating manual errors and enabling more frequent execution.
  • Built a Python utility for rapid software build infrastructure navigation, removing the need to manually access scope files.
  • Collaborated on a joint automation initiative to streamline GTA (Google Test Assistant) input collection workflows.
  • Redesigned algorithm logic to extract testable functions from C/C++ source code, automating a repetitive unit-testing preparation step.

Skills

Technical stack

Tools and platforms I use most in production data engineering work.

Programming & Core Engineering

Python · SQL · Scala · Java · Bash · Git

Cloud & Infrastructure

S3 · Lambda · ECS / ECR · EventBridge · Glue · Snowflake · Databricks

Data Orchestration & DevOps

Apache Airflow · Docker · Kubernetes · CI/CD · Terraform

Data Modeling & Architecture

Star Schema · Snowflake Schema · Schema Design (OLTP vs OLAP) · Data Partitioning Strategies

Data Quality & Governance

Deequ · Data Validation Rules · Schema Enforcement · Metadata Management · Data Lineage

Data Collection & Automation

Scrapy · BeautifulSoup · Playwright · Selenium · ScraperAPI

Certification

AWS Certified Cloud Practitioner (CLF-C01)

Amazon Web Services

Let's connect

Want to get in touch?