I'm Jose Acosta, Data Engineer

Time-Series & Real-Time Systems | ex-Quant Trader | ex-Construction Project Manager

Building high-availability data systems where reliability isn't optional

Open to Full-Time Data Engineering Opportunities

📍 Based in Caracas, Venezuela | Available for remote work worldwide

"Data quality, latency, and reliability aren't 'tech details'—they're business risk. When a pipeline fails and decisions can't wait, you learn to build for resilience."

Python • PySpark • dbt • Snowflake • Airflow • AWS • High-availability

  • High-Availability: Fault-Tolerant Systems
  • Cost-Optimized: Infrastructure Design
  • Production-Scale: Data Processing
  • Low-Latency: Real-Time Processing

Data Engineer with Real-World Context

I'm a Data Engineer who came up through quantitative trading. For four years I put real capital behind time-series models—where a bad join or silent failure cost money before you could roll it back. That experience taught me that every delay, quality issue, or bad assumption in data has a real business cost. Now I build the high-availability, real-time data systems I wished I'd had.

Before that, I spent four years in construction project management, learning to deliver under pressure, manage constraints, and communicate with clarity across technical and non-technical teams. That background shaped how I approach systems today: with structure, accountability, and respect for the human side of engineering.

Philosophy: Data quality, latency, and reliability aren't "tech details"—they're business risk. When a pipeline fails and decisions can't wait, you learn to build for resilience, instrument everything, and ship only what you can monitor.

How I Build

  • Product-minded: Pipelines aligned to decisions & KPIs, not just storage
  • Quality & reliability first: Unit tests + dbt data tests, SLAs/SLIs, lineage tracking
  • Cost-aware by design: Partitioning, pruning, caching, orchestration, right-sizing
  • Data contracts: Work backward from outcomes to schemas, ownership, and alerts—so when something breaks at 2 AM, the right person gets paged, not the entire team
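
To make the data-contract idea concrete, here is a minimal sketch in Python. It is illustrative only and not taken from any project described on this page: the `DataContract` class, the `violations` and `route_alert` helpers, and the `pricing-oncall` owner are all hypothetical names. It shows a schema paired with a single owner, so a violation is routed to one person rather than the whole team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: expected schema plus the owner to alert."""
    dataset: str
    owner: str    # who gets paged when the contract is violated
    schema: dict  # column name -> expected Python type


def violations(contract, record):
    """Return human-readable problems found in one record."""
    problems = []
    for column, expected_type in contract.schema.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            problems.append(
                f"bad type for {column}: expected {expected_type.__name__}"
            )
    return problems


def route_alert(contract, record):
    """Return (owner, problems) if the record breaks the contract, else None."""
    problems = violations(contract, record)
    return (contract.owner, problems) if problems else None


# Made-up example contract for a trades feed.
trades = DataContract(
    dataset="trades",
    owner="pricing-oncall",
    schema={"symbol": str, "price": float, "ts": int},
)
```

Calling `route_alert(trades, record)` on each incoming record would return `None` for a conforming record and the owner plus a list of problems otherwise. A real setup would likely express the same contract in dbt tests or a dedicated validation tool rather than hand-rolled checks.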

Recent Work:

  • Real-time ingestion system: Built a WebSocket-to-cloud-warehouse pipeline with alerting and on-call playbooks that reduced data staleness from hours to seconds, enabling same-day decisions
  • Document parsing & normalization: Shipped services that transform semi-structured data into clean schemas, reducing manual data cleanup by 80%
  • Streaming/ETL pipelines: Designed systems that cut research cycles by 40% and made backtesting 5x faster through reliable data infrastructure
  • Anomaly detection services: Delivered monitoring that surfaces data quality issues before they impact downstream users or business decisions
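
As a rough illustration of the kind of check an anomaly detection service can run, the sketch below flags a metric (for example, a daily row count) that drifts far from its recent history. It is an assumption-laden example, not the implementation behind the work listed above: the `is_anomalous` function, the z-score approach, the threshold of 3.0, and the sample numbers are all made up for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History is constant, so any change at all is suspicious.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Made-up daily row counts for a hypothetical table.
daily_row_counts = [1000, 1020, 990, 1010, 1005]
```

With this history, a new count close to 1,000 passes, while a sudden drop to a few hundred rows is flagged. A z-score is only one simple option; seasonal data usually needs a baseline that accounts for day-of-week or time-of-day patterns.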

Core Tech:

Data & Processing:

Python • SQL • PySpark

Platforms & Tools:

Snowflake • Databricks • dbt

Orchestration & Infra:

Airflow • Docker • AWS • Kubernetes

Quality & Monitoring:

MLflow • Great Expectations • Soda

Specialties: Time-Series | Streaming | Real-Time Systems | Data Modeling | ML in Production

Earlier Career

Before tech, I managed civil-engineering projects—valuations, resource flows, construction metrics. Different domain, same lesson: decisions are only as good as the systems and data that support them.

This foundation taught me to deliver under constraints, manage stakeholder expectations, and communicate complex technical concepts to non-technical audiences. It's why I approach data engineering with structure, accountability, and an understanding that reliability isn't just about uptime—it's about enabling people to do their jobs with confidence.

What I'm Looking For

Currently: I'm seeking full-time Data Engineering opportunities with teams building mission-critical data systems. I also take on select consulting projects for startups and small businesses that need hands-on help establishing reliable data infrastructure.

Full-Time Data Engineering Roles

Positions with teams building mission-critical data systems in data-intensive products—where milliseconds matter and "close enough" breaks the business model.

Domains that interest me: Fintech (real-time pricing, risk models, trading infrastructure), E-commerce (inventory optimization, recommendation engines), Logistics (supply-chain analytics), SaaS (product analytics, usage-based billing)

Ideal team environments:

  • High-stakes systems where downtime has immediate business impact
  • Real-time requirements (streaming, event-driven architectures)
  • Cost-sensitive projects where optimization affects margins
  • Culture that treats data infrastructure as a product, not a cost center

What I bring to both contexts:

  • Reliability-first mindset shaped by years in high-stakes environments
  • Business impact focus: I translate between engineering and business language
  • Production-ready systems built for observability and long-term maintenance
  • Proven ability to deliver under pressure with cross-functional teams

Consulting & Project Work

I also partner with select startups and growing businesses that need production-grade data infrastructure but aren't ready for a full-time hire. For consulting services and project-based work, view my services page.

Happy to share repos, architectural diagrams, or walk through design decisions and trade-offs. I believe in building systems and cultures defined by clarity, empathy, and accountability.

Production Metrics in Detail

  • High-Availability Systems: Built fault-tolerant production pipelines with automatic failover, processing production-scale data with real-time and batch workloads, including on-call coverage and incident response
  • Cost-Optimized Infrastructure: Achieved through strategic partitioning, query optimization, compute right-sizing, and orchestration improvements without sacrificing performance
  • Low-Latency Processing: Optimized end-to-end latency for real-time ingestion pipelines, enabling rapid decision-making for time-sensitive use cases
  • Production-Scale Processing: Sustained throughput across batch and streaming workloads with data quality checks, lineage tracking, and automated alerting

Professional Certifications

Continuous learning and professional development through industry-recognized certifications

8 Total Certifications | 8 Verified | 6 Institutions | 1 In Progress

IBM Data Engineering Professional Certificate

IBM

In Progress | Professional Certificate

Issued May 2025

Duration: 11 months

Comprehensive professional certificate covering data engineering fundamentals, ETL processes, and cloud-based data solutions.

Skills Acquired:

Apache Kafka • Apache Spark • ETL Pipelines • Cloud Computing • +5 more

Inferential Statistical Analysis with Python

University of Michigan

Completed | Specialization Course

Issued Nov 2024

ID: R7LPZ5VW13NJ

Statistical analysis techniques using Python for data-driven decision making.

Skills Acquired:

Python • Statistical Analysis • Hypothesis Testing • Confidence Intervals • +1 more

Understanding and Visualizing Data with Python

University of Michigan

Completed | Specialization Course

Issued Nov 2024

ID: OHX0446VDLS8

Comprehensive course on data analysis and visualization techniques using Python libraries.

Skills Acquired:

Data Visualization • Python • Pandas • Matplotlib • +2 more

Data Science Orientation

Coursera

Completed | Course

Issued May 2025

Introduction to data science career paths and industry overview.

Skills Acquired:

Data Science Fundamentals • Career Development

Python Profesional

Código Facilito

Completed | Professional Course

Issued Jun 2023

Professional Python course for use in data science.

Skills Acquired:

Python Programming • Algorithms • Data Structures • Object-Oriented Programming

What is Data Science?

IBM

Completed | Course

ID: 8O4B21RURLGO

Introduction to data science methodologies and applications across industries.

Skills Acquired:

Data Science Fundamentals • Industry Overview

Consultor Internacional Certificado

Espabílate Consulting Group

Completed | Professional Certification

Certification in international consulting practices and methodologies.

Skills Acquired:

International Consulting • Business Strategy • Project Management

Inglés Avanzado

Kaplan UK

Completed | Language Certification

English proficiency certification for professional environments.

Skills Acquired:

English Proficiency • Business Communication • Technical Writing

Continuous Learning Journey

Always expanding knowledge and staying current with industry trends and technologies

Let's Collaborate

Ready to Work Together?

Let's discuss your data challenges and create solutions that drive real business value with production-grade reliability.

Start a Conversation

Discuss your specific data engineering challenges and explore how my experience can help solve them.

Explore Services

See detailed information about my data engineering services and how they can transform your business.

What makes me different:

  • 3+ years data engineering experience
  • Production-tested reliability
  • Business impact focus

Your Data Solutions Partner

Data Engineer focused on building robust data pipelines, scalable architectures, and automated workflows. Enabling teams to make smarter, data-driven decisions through reliable systems and practical engineering skills.

Contact

Ready to Connect?

For full-time Data Engineering opportunities or consulting projects, let's discuss how I can help build reliable data infrastructure.

© 2025 Jose Acosta. All rights reserved.
Design & Development by Jose Acosta