AI Engineer
Build the workspace where AI agents are built and fixed
TL;DR
We're hiring an AI Engineer to design and ship LLM-based workflows in Neatlogs: trace summaries, comparisons, suggestions, and eval flows that make debugging agents less painful. You'll work directly with the founder and the lead backend engineer to turn messy real-world agent runs into clear, useful tools for serious builders.
What is Neatlogs
A specialised workspace for teams that build AI agents.
The problem: Most debugging tools today are built for infra and AI/ML engineers. They're hard for business teams to use, even though those teams are the only ones with the full context needed to ship AI agents.
So tech and business teams pass agent runs (traces) around in Slack, email, screenshots, and long calls. The process is frustratingly slow, and important details get lost along the way. Feedback ends up vague and scattered across multiple platforms, and most agents never actually ship (only ~5% of them make it to production).
The fix: Neatlogs is a shared workspace, built on top of traces. Devs and business teams look at the same trace, see what went wrong, and turn feedback into code changes in minutes (instead of days).
Every wave of tech got its multiplayer workspace. Code got GitHub. Design got Figma. Agents don't have one yet. That's what Neatlogs is.
We're a small, focused team with early users and a real product in the wild. We're backed by top investors from the US, India and Japan.
Here's a walkthrough of Neatlogs from AI dev tool expert Tyler Reed:
Why this is a good hill to die on
- Agents are moving into production, but most teams still debug them with log files and chat threads. No one has nailed how agent debugging should work. You will help define that in code: how runs are stored, sliced, searched, analysed and compared.
- Most “AI products” are still demos. You will build agentic workflows that run in production and are used every day by real teams.
- You will report directly to the founder. You will work very closely with users and product, not hidden behind a long chain of PMs.
- No one has built a platform that non-technical teams can use and that devs still respect. You will build for both at once. That is a hard engineering problem.
Here's a short deck on our vision:
View the Neatlogs vision deck →
How you'll spend your days
- Design and build agentic workflows that help teams see and fix agent behavior. For example: trace summaries, run comparisons, clusters, and suggestions for fixes.
- Build production-grade automations for different teams. For example: support, ops, data, or sales workflows that run on top of agents and tools.
- Decide how and what to log for LLM and agent runs, so they are easy to inspect, replay, and debug.
- Work with the lead backend engineer to make sure your flows scale and do not slow the system down.
- Set up simple evals that tell us if a change to prompts, tools, or routing made things better or worse.
- Watch how users react to AI-powered features. Simplify or remove anything that confuses them or is not reliable enough.
- Keep up with new models and tools, but only bring in things that make the product clearly better.
You might be a good fit if…
- You have at least 2 years of experience building real products with LLMs, not just toy chatbots.
- You have built agent-based automations that run in production for real users.
- You are comfortable in Python and have used at least one agent or LLM framework such as LangGraph, CrewAI, LangChain or similar.
- You can design and read traces and logs, and you enjoy debugging odd model behavior.
- You can work with Postgres and a vector database, and you are happy to touch backend code beyond just calling an API.
- You have run some form of evals before, even if simple, and you care about making AI behavior more predictable over time.
- You like early-stage work where you can ship something this week, see how it behaves in the wild, and improve it.
- You treat writing as part of the job. You can explain problems, decisions, and tradeoffs clearly in text so people in other time zones can move without a meeting.
Compensation and how we work
- Salary in line with top product startups, plus stock options that are large enough to matter if this works.
- We work remote-first, meet in person in Gurgaon when needed, and hold regular offsites for a few days so we can plan, hack, and hang out together.
- Many of our users are AI startups in San Francisco. You will talk to them often, and maybe even visit sometimes.
- We set aside budget each year for books, courses, or conferences you want to attend.
Apply
Email your best work to people@neatlogs.com or send a DM to Ajay.