
From Demo to Deployment, Where Robotics Actually Fails

Brandon Hetherington, Editor | Q&A with Robotics Expert Hari Unnikrishnan

Robotics has long promised to bridge the gap between intelligent systems and the physical world, but as many engineers have learned, what works in a lab rarely survives first contact with reality. In this interview with RoboticsTomorrow, Hari Unnikrishnan explains what it actually takes to move AI out of controlled environments and make it perform reliably at scale. A senior perception engineer at Orchard Robotics, Unnikrishnan has spent more than 15 years turning research into deployed systems, working across agriculture, retail robotics, and large-scale spatial analytics.

With a background that spans signal processing, computer vision, and edge AI, and a Ph.D. in electrical engineering, his work focuses on a core challenge many teams underestimate: how to build systems that do not just work once, but continue to work across thousands of units, unpredictable environments, and real-world constraints.
 

You began your career in electrical engineering before moving into robotics and edge AI. What initially drew you to this space, and how did that transition shape your perspective on building intelligent systems?

I started in electrical engineering with a focus on signal processing. It’s a very abstract field, but once I started applying it to image and audio processing, the math became concrete for me. Moving from there into pattern matching, computer vision, and eventually AI didn't feel like a career shift. It felt like a logical evolution. Because I’ve worked through every layer of the system, from the raw signal to the final decision, I’m able to look past the AI hype. That background keeps me focused on the actual objectives of the system and, more importantly, on how to use these concepts to deliver a practical benefit.

 

Much of your work has focused on taking research out of controlled environments and into real-world deployment. Where do you see the biggest gaps between innovation in the lab and what actually works at scale?

In a research lab, the primary goal is finding a novel algorithm; for instance, the time-frequency masking I developed to isolate a single voice in a noisy, multi-source environment. That’s a massive win in a controlled space. But scaling that into the real world is an entirely different category of challenge. Lab environments often ignore the engineering constraints: unit cost, long-term serviceability, and reliability under variable conditions. I saw this firsthand at KeyMe. It’s one thing to make a robot work once for a demo; it’s another to deploy thousands of robotic kiosks that have to perform consistently in diverse retail environments for millions of customers.

 

In your current role at Orchard Robotics, you are developing perception systems for precision agriculture. How is computer vision changing the way decisions are made on the ground?

The advancement in AI over the last five or six years has been dramatic. Previously, with computer vision, a massive amount of engineering effort was spent just on designing failsafes and redundancies to handle edge cases. Today, by standing on the shoulders of foundational models, computer vision is approaching or even surpassing human performance in specific industrial tasks. For example, manually estimating the yield of every tree across hundreds of acres is humanly impossible: the scale is too large and the data is too dense. But a modern CV pipeline can process that entire environment with a level of precision and speed that turns "guessing" into "actionable data." On the ground, this changes decision-making from a game of averages to precision management at the individual plant level, allowing growers to optimize resources tree by tree.

 

You have led the rollout of vision-driven robotics across thousands of units. At that level of deployment, what becomes most difficult to control, and how do you maintain consistency across systems?

When scaling vision-driven robotics to thousands of units, as I did while leading engineering teams to deploy kiosks across every mainland state, the most difficult variables to control are environmental and mechanical drift and manufacturing variance. At this level of deployment, it is physically and economically impossible to rely on manual, routine servicing. Instead, the focus must shift to building systems that are self-aware. I address this by architecting self-calibrating systems that use the robot's own vision to monitor internal geometries, allowing the machine to programmatically adjust its coordinate transforms in real time. Essentially, the robot self-calibrates at every output, creating a closed-loop system. This is paired with AI-based fault detection, which analyzes sensor metadata and model confidence to identify issues like lens occlusion or mechanical wear before they lead to a system failure.
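
As a rough illustration of the self-calibration idea described above (a sketch, not Unnikrishnan's actual implementation), the Python below re-estimates a 2D rigid transform from a few reference markers the camera can see inside the machine, and flags the unit when the fit is poor. The function names, the use of fiducial markers, and the 0.5-unit residual threshold are all assumptions made for the example.

```python
import math

def fit_rigid_2d(expected, observed):
    """Least-squares rotation + translation mapping expected -> observed points."""
    n = len(expected)
    if n < 2 or n != len(observed):
        raise ValueError("need at least two matched point pairs")
    # Centroids of both point sets.
    ex = sum(p[0] for p in expected) / n
    ey = sum(p[1] for p in expected) / n
    ox = sum(p[0] for p in observed) / n
    oy = sum(p[1] for p in observed) / n
    # Accumulate dot and cross products of the centered points;
    # their ratio gives the best-fit rotation angle in closed form.
    dot = cross = 0.0
    for (x, y), (u, v) in zip(expected, observed):
        ax, ay = x - ex, y - ey
        bx, by = u - ox, v - oy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated expected centroid onto the observed one.
    tx = ox - (c * ex - s * ey)
    ty = oy - (s * ex + c * ey)
    return theta, tx, ty

def apply_rigid_2d(theta, tx, ty, point):
    """Rotate a point by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)

def calibrate_or_flag(expected, observed, max_rms=0.5):
    """Refit the transform and report whether the residual looks healthy.

    max_rms is a hypothetical threshold in the same units as the points.
    """
    theta, tx, ty = fit_rigid_2d(expected, observed)
    sq = 0.0
    for p, q in zip(expected, observed):
        px, py = apply_rigid_2d(theta, tx, ty, p)
        sq += (px - q[0]) ** 2 + (py - q[1]) ** 2
    rms = math.sqrt(sq / len(expected))
    return {"theta": theta, "tx": tx, "ty": ty, "rms": rms, "healthy": rms <= max_rms}
```

In a loop like this, a small, well-fitting correction would simply update the coordinate transform, while a large residual (for example, one marker far from where any rigid motion could put it) would be treated as a symptom of occlusion or mechanical wear and reported rather than silently absorbed.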

 

Edge AI is increasingly central to modern robotics. When designing for real-world conditions, how do you balance performance with constraints like latency, compute, and power?

Designing for the edge is a constant exercise in managing trade-offs. I saw this clearly while architecting algorithms for thermal sensors tracking people across tens of thousands of devices. At that scale, with sensors streaming 24/7, running heavy inference on every frame simply isn't viable because the power and compute budgets of edge hardware won't support it. My approach was to layer the intelligence. I used classical image processing to filter out 'empty' frames before they ever reached the AI inference engine, which dramatically reduced the compute load. For frames that did require inference, I applied quantization and pruning to right-size the models for the hardware without sacrificing meaningful accuracy. This kept the system within its thermal envelope while maintaining the low-latency response that real-time tracking demands.
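
The layering described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production pipeline: it assumes frames are small 2D grids of temperature readings, that a per-device background frame is available, and that `run_inference` stands in for whatever model the device runs. The function names and the thresholds (`delta`, `min_pixels`) are hypothetical.

```python
def frame_has_activity(frame, background, delta=2.0, min_pixels=4):
    """Cheap classical gate: True if enough pixels differ from the background.

    delta is the minimum per-pixel difference that counts as a change;
    min_pixels is how many changed pixels are needed to call the frame non-empty.
    """
    changed = 0
    for row, bg_row in zip(frame, background):
        for value, bg in zip(row, bg_row):
            if abs(value - bg) > delta:
                changed += 1
                if changed >= min_pixels:
                    return True  # stop early; no need to scan the rest
    return False

def process_stream(frames, background, run_inference, **gate_kwargs):
    """Run the expensive model only on frames that pass the cheap gate."""
    results = []
    skipped = 0
    for frame in frames:
        if frame_has_activity(frame, background, **gate_kwargs):
            results.append(run_inference(frame))
        else:
            skipped += 1
    return results, skipped

# Example: an 8x8 grid at 22 degrees, with one frame containing a warm 2x2 blob.
background = [[22.0] * 8 for _ in range(8)]
empty = [row[:] for row in background]
warm = [row[:] for row in background]
for r in (3, 4):
    for c in (3, 4):
        warm[r][c] = 30.0

results, skipped = process_stream(
    [empty, warm, empty], background, run_inference=lambda f: "person"
)
# results == ["person"], skipped == 2
```

The design choice is that the gate costs a handful of comparisons per pixel, so even a gate that lets some empty frames through still removes most inference calls when scenes are empty most of the time.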

 

Your experience spans retail robotics, spatial analytics, and agriculture. How do you adapt core AI and perception frameworks to perform reliably across such different environments?

The environments change, from indoor retail kiosks and office buildings to outdoor orchards, but the engineering discipline doesn't. What transfers across domains is the approach: you need to understand the failure modes of your specific environment first, then build the perception system around them. In retail robotic kiosks, reliability came from tight calibration and error correction in a controlled, repeatable setting. In spatial analytics, the challenge flipped: the sensors were fixed, but the scenes were entirely unpredictable and noisy. There, I had to design for algorithmic robustness and graceful degradation across tens of thousands of devices streaming 24/7. Now, in precision agriculture, we face fully uncontrolled outdoor conditions. The sheer variability and frequent lack of access mean resilience and observability are key. Same discipline, different environments.

 

You also created OpenGlottal, an open source project applying AI to medical diagnostics. What motivated you to explore that area, and what role do you think open source plays in accelerating applied research?

OpenGlottal is my attempt to bring a decade of industry progress back to the domain I started in. My PhD focused on vocal fold kinematics, but the clinical tools in that space hadn't kept pace with what computer vision had become. Techniques like YOLOv8 and U-Net that I was routinely applying in robotics and spatial analytics could meaningfully advance clinical diagnostics, but that transfer wasn't happening fast enough.

Open source is the right vehicle for that because adoption in healthcare is inherently slow. Making it open accelerates iteration and widens access. But it's also a two-way exchange: the clinical domain pushes back with constraints and edge cases that enrich the models in ways purely commercial work doesn't. It becomes a playground for faster, more grounded research.

 

As robotics systems become more autonomous and widely deployed, where do you see the next wave of opportunity, and what challenges still need to be solved to get there?

Frontier AI models converse in text, sound, images, and video, but the physical world is multimodal in ways that go well beyond those. Looking ahead, I see the next wave in Physical AI. We’ve made remarkable progress in vision, but for true autonomy in unstructured worlds, the big gap is tactile intelligence. A robot can’t just see; it has to 'feel' pressure and tension to handle objects reliably, whether that’s a harvest arm or a medical device. The other unsolved challenge is graceful failure. At scale, the goal isn't just a system that performs; it’s a system that knows exactly what to do when it can’t.

 
The content and opinions in this article are the author’s and do not necessarily represent the views of RoboticsTomorrow.
