How In-House Counsel Buy Legal AI Tools That Stick

AI is everywhere in legal tech right now, and the volume of the conversation has made it harder, not easier, to decide what to actually buy. Some lawyers experiment with everything. Others have not touched it and are not sure they want to. Either way, the practical question is the same: what does it take to choose an AI contract tool your team will keep using six months from now?

This How to Contract webinar was hosted by Nate Kostelnik and featured Alexandra Sepulveda, Associate General Counsel at Trust & Will, and Bensu Aydin, Commercial Counsel at Figma. Both have been in commercial contracting for more than a decade, and both have spent the past year working through what it means to bring AI into an in-house workflow. That made the conversation less about hype and more about the small, concrete decisions that determine whether a tool sticks.

The discussion moved through where to start when you are a skeptic, how to run pilots and demos that tell you something real, why legal-specific tools often beat general models, how to read vendor accuracy claims, the security questions that are not negotiable, how to integrate AI with your existing stack, and where human judgment still has to stay in the loop.

Here are our top ten takeaways from the speakers' comments during the webinar:

Start with low-risk, high-volume work. When you are getting started, pick the use case where a mistake costs little and the time saved is large. The NDA is the classic entry point, but event contracts and other high-volume agreements work too. Factor in your company's AI tolerance and your own professional comfort before you choose. This keeps your first experiment from being something you cannot afford to get wrong.
Borrow your community's experience before you spend your own time. Your listservs and lawyer group chats already hold most of the evaluation you are about to do. Ask what people use and how it is working. A referral can also get you a free or reduced-cost pilot. There is no reason to discover from scratch what a peer can tell you in a sentence.
Walk into a pilot knowing what success looks like. Bensu's team decided what to measure before piloting, talking to peers about quality, ease of use, and fit. Measuring AI outcomes is hard, but a pilot with no metric is just a trial you forget to evaluate. Decide in advance what a good result would actually look like for your team. Then the pilot can give you an answer instead of an impression.
Run demos on your workflow, not the vendor's script. Ask your own questions and watch whether the tool can follow you. A presenter walking through a rehearsed flow tells you very little about your work. If a vendor cannot switch gears to your scenario, that is a signal. The demo you control is worth far more than the demo they have practiced.
Choose legal-specific tools for concrete reasons, not branding. A legal-purpose tool may route prompts to the best underlying model, fit your workflow, and make usage tracking easier. It often ships with prompts built by people who think about contracts the way you do. A general model means you write every prompt yourself. Decide based on whether you have the bandwidth to recreate what the legal tool gives you.
Treat accuracy claims as a prompt to test, not a fact to trust. A vendor's accuracy percentage is not something you can act on. As long as humans stay in the loop, the number changes very little about how you work. If a vendor does cite a figure, use it as your cue to run a live test on your own contracts. What matters is whether the tool follows your instructions, not whether it claims to do your thinking.
Count cognitive load and clicks as real criteria. A clean, uncrowded interface invites your team to actually use the tool. Every extra click adds friction, and for a profession still adjusting to this much technology, that friction costs more than it would elsewhere. The most powerful tool loses to the one your team will open every day. Weigh usability alongside capability.
Keep humans in the loop, and be specific about why. Right now you hold more context than the tool does, and judgment gets built by living through situations a model never sees. Alexandra's deposition story about shifted redaction tape is the kind of catch a human makes because they are in the room. Use AI for the first pass and the stress test. Keep the final judgment with the lawyer.
Make security and confidentiality gates, not comparison points. A tool that trains on your data is off the table, and you need a real security baseline before anything else. Decide whether the business gets access at all, and what they are allowed to self-serve. Watch for the tool that just wants to please, because hard legal conversations need genuine back and forth. These questions come before feature comparison, not during it.
Spend saved time on the work you could never reach. Faster review is not the real prize. The prize is the renewal-tracking memo, the communications plan, and the stress-tested draft you finally have bandwidth to produce. Build the record of what you gave up and what you want back at renewal. That is the strategic work these tools are supposed to unlock, and now you can point to it.

Whether you joined this webinar live or are catching up now, our weekly newsletter keeps you close to what is next from How to Contract, including upcoming webinars and recaps like this one. Subscribe now so the practical insights reach you even when your calendar does not let you attend.