james.freire@gmail.com | (703)297-2409 (SMS)
linkedin.com/in/jamesfreire | github.com/jamesfreire | jamesfreire.substack.com
Principal Data Architect with 15+ years designing and implementing complex cloud-based data solutions across enterprise environments. An Amazon/AWS alumnus with proven expertise in the data ecosystem, plus consulting engagements with Amazon. Deep background in both traditional and modern data technologies, including vector databases, machine learning, and GenAI integration. Track record of leading strategic data initiatives, reducing costs by 45%, and achieving 97% enterprise compliance through innovative data architecture.
Modern Data & AI: Vector databases (TimescaleDB, pgvector), TensorFlow, Machine Learning & MLOps, GenAI/RAG architectures, Big data processing (Spark)
Traditional Data: Relational database design & optimization, Data modeling (transactional & analytics), BI dashboards (Tableau, Oracle Analytics), Data governance
AWS Data Ecosystem: Redshift, S3, Glue, Lambda, SageMaker, EMR, Bedrock, EC2
DataOps & Infrastructure: Python automation, ETL/ELT pipelines, Infrastructure as Code, CI/CD, Data testing & versioning
Programming & Analytics: Python (pandas, numpy, scikit-learn), R, SQL, Statistical modeling
September 2023 - Present
Led enterprise data architecture for multi-billion-dollar retail operations, designing governance frameworks achieving 97% agile project management compliance across all teams
Pioneered AI initiatives by integrating Atlassian Enterprise Insights with large language models (LLMs) and developing vector-based solutions for automated JIRA insights and knowledge discovery
Architected cloud-native data solutions on GCP utilizing medallion architecture, BigQuery, and Cloud Composer, serving 450,000+ associates with analytics
Developed JIRA ETL automation using Python, SQLAlchemy, and pandas for SQL database integration, eliminating manual processes and saving project managers company-wide several days each quarter
Created comprehensive BI dashboards in Tableau for sprint compliance monitoring, ensuring standardized JIRA practices across 50+ development teams
Designed factual data models covering all Home Depot project management data, establishing foundation for enterprise-wide analytics
December 2022 - Present
Leading vector database research in collaboration with TigerData and TimescaleDB teams, contributing to open-source ecosystem and publishing thought leadership on modern data stack integration
Developing enterprise applications of automated vector embeddings and RAG architectures for knowledge management systems
Architecting GenAI solutions using foundation models and frameworks for enterprise data discovery and automated insights
December 2022 - February 2023
Provided strategic consulting to AWS Security leadership on cloud-native reporting architectures and data strategy best practices
Designed scalable ETL pipelines using AWS Glue for high-volume security data transformation and ingestion across global AWS operations
June 2023 - August 2023
Orchestrated cloud migration strategy transitioning on-premises analytics platform to AWS with RStudio and SageMaker
August 2019 - November 2022
Transformed customer experience through advanced data modeling and statistical analysis, reducing Time-to-Resolution by 40% via optimized ticketing processes
Developed machine learning solutions using Python (scikit-learn) for predictive data center facility maintenance, including random forest models for root cause analysis, ticket severity classification, and incident prediction
Delivered executive intelligence through comprehensive KPI reporting and BI dashboards using R, ggplot2, and Oracle Analytics Cloud, enabling data-driven infrastructure investment decisions
Innovated monitoring capabilities through A/B statistical testing for data center systems, discovering 10%+ accuracy discrepancies and preventing infrastructure failures
March 2012 - July 2017
Architected enterprise data consolidation strategy, designing ETL workflows to centralize Amazon’s control system data into a unified Redshift data warehouse supporting billions in infrastructure decisions
Pioneered innovative solutions including “Alexa for the Data Center” voice interface application, enabling real-time performance monitoring, a first-of-its-kind solution adopted across AWS global operations
Developed regression models in Python (scikit-learn) to optimize guided workflows for mechanical maintenance engineers, improving operational efficiency
Led cross-facility deployment of data-driven workflow tools across all AWS data centers, resulting in enhanced uptime and significant reduction in facility outage risks
July 2008 - September 2010
Managed $250M fund through advanced data analytics and algorithmic trading strategies, reducing execution slippage by 45% through post-trade analysis and broker algorithm customization
Implemented effective trading strategies using Python, R, and SQL for backtesting and algorithmic transaction cost analysis
B.A. History - University of Connecticut (2010)
Licensed Skydiver - USPA
Extra Class Amateur Radio Operator - NU3F