ProfitSolv is a SaaS business services provider for the legal and accounting industry. We are looking for an AI Data Engineer to join our growing team!  

We are seeking a seasoned AI Data Engineer to help build a centralized data platform on AWS that unifies data across the entire portfolio and powers the next generation of AI-driven experiences for our customers.

This is a greenfield build. We have a clearly defined three-phase architecture (Lakehouse standup → database migration → zero-ETL endgame) and executive sponsorship. We’re looking for an engineer who is equal parts data engineer and AI builder - someone who can write dbt models in the morning and design a RAG pipeline in the afternoon.

What we provide:  

  • Opportunity to Invest in Your Future. We offer a 401(k) match.  
  • Paid Time Off. Enjoy paid time off and paid holidays.  
  • Great Coverage. Take advantage of health, dental, and vision coverage, plus HSA and FSA options.  
  • A Great Team. Collaborate with smart, curious, hardworking individuals.  
  • Performance Compensation. Be rewarded for your hard work with performance-based merit increases.  
  • Remote Work. Want to work from home? No problem! 

 
As an AI Data Engineer, you will: 

  • Build and maintain a medallion Lakehouse (Bronze/Silver/Gold) on S3 using Apache Iceberg, Glue Data Catalog, and dbt Cloud with the Athena adapter.
  • Configure and manage AWS DMS for ongoing CDC from ~1,000 SQL Server instances. Build ECS Fargate tasks for SaaS API ingestion. Orchestrate it all with Amazon MWAA (Airflow).
  • Write dbt Cloud models for Bronze → Silver → Gold transforms. Define business metrics in the dbt Semantic Layer so BI tools and AI agents can consume them.
  • Manage Redshift Serverless + Spectrum as the read engine for analysts and BI tools. Tune Iceberg table layouts, partitioning, and compaction for performance.
  • Implement Lake Formation tag-based governance for multi-product data isolation. Onboard new acquisitions to the platform in weeks, not months.
  • Build batch embedding pipelines for legal documents and client records. Manage vector storage in OpenSearch Serverless or pgvector on Aurora.
  • Design and ship RAG pipelines for legal-domain use cases: chunking strategies, retrieval ranking, and context window management.
  • Build MCP servers that expose the dbt Semantic Layer and data platform APIs to AI agents (Claude, internal copilots, customer-facing features).
  • Ensure compliance, security, and governance through IAM roles, encryption policies, and metadata cataloging.
  • Other duties as assigned. 
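To give a flavor of the day-to-day work, the chunking step of a RAG pipeline (mentioned above) might look like this minimal sketch. The function name, character-based sizing, and default parameters are illustrative choices, not part of our actual stack:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping chunks for embedding.

    chunk_size and overlap are measured in characters here for
    simplicity; a production pipeline would typically count tokens.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

In practice, you would own design decisions like this end to end: chunk granularity, overlap, embedding model choice, and how retrieved chunks are ranked and fit into the context window.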

This position follows established policies and procedures to keep confidential information secure. 

A great fit for this position has: 

  • 5+ years of hands-on data engineering experience, with a strong focus on AWS services such as S3, Glue, Athena, and Redshift (or equivalent platforms).
  • Proven experience building and maintaining production-grade data models using dbt (Core or Cloud), including testing, macros, and documentation best practices.
  • Experience implementing Change Data Capture (CDC) from relational databases using tools such as AWS DMS, Debezium, or similar technologies.
  • Demonstrated ability to design, build, and operate production Airflow DAGs (MWAA or self-hosted environments).
  • Hands-on experience developing at least one production-ready Retrieval-Augmented Generation (RAG) pipeline, including data chunking, embedding generation, vector storage, and retrieval mechanisms.
  • Strong proficiency in SQL (primary language) and Python (for data pipelines and AI workflows), with working knowledge of TypeScript for MCP server development.
  • Experience with infrastructure-as-code tools such as Terraform or equivalent solutions.
  • Comfortable working across the full technology stack in a high-autonomy environment, with the ability to make architectural decisions and drive solutions independently.

Additional desirable qualifications: 

  • Experience building MCP servers or similar AI tool-use integrations.
  • dbt Cloud Semantic Layer / MetricFlow experience.
  • Worked with Apache Iceberg, Delta Lake, or Hudi in production.
  • SQL Server → Aurora PostgreSQL migration experience.
  • Worked in PE-backed, multi-product, or M&A-heavy environments.
  • Legal tech, payments, or professional services domain knowledge.

Physical requirements: 

  • Ability to sit for prolonged periods at a desk and work on a computer.  
  • Must be able to lift up to 15 pounds at times.  
  • Ability to handle stress.  
  • Ability to meet work deadlines.  

Our commitment to you: At ProfitSolv, we are committed to being a diverse and inclusive workplace as an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, protected veteran status, disability status, sexual orientation, gender identity or expression, marital status, genetic information, or any other characteristic protected by law. We embrace a diverse group of backgrounds and experiences to connect with clients, solve problems, and innovate. 
 
Work location: Remote – U.S. only 

This is a full-time position.
