OEMs should be more aggressive about an ‘AI First’ approach

William Wei, EVP and CMO, Skymizer

Generative AI has made massive progress in the last few years and may substantially change how drivers interact with their cars. We talked to William Wei, EVP and CMO at Skymizer, an AI infrastructure company based in Taiwan. Mr. Wei gave us some interesting insights into how in-vehicle AI will develop on both the software and hardware levels.

Mr. Wei, firstly, what is your general definition of a Software-Defined Vehicle?

Traditional in-vehicle software used to be static, with frozen behavior. You write code in C or Python, then compile it into machine code that is frozen into the hardware. I call this static software, or software 1.0. The software-defined vehicle, or SDV, as I understand it, is much more than that. Today’s AI agents use Large Language Models − LLMs − and work at runtime; the system is dynamic. When you talk to an agent in the car, the compilation and reasoning workflow happens at runtime, and it is non-deterministic, so it is not frozen. I call that software 2.0; the trend is toward agentic, or ‘AI first’. However, software-defined performance is very much hardware-dependent, because the software’s capabilities depend on the capabilities of the processing units. When we design the future SDV, we should use a software and hardware co-design approach.

In-vehicle voice recognition systems are not always very flexible. How can we make cars understand and communicate in a more human-like fashion?

Siri or Alexa, for example, are again frozen systems. These traditional systems cannot improve themselves. But when you talk to a frontier model − put simply, the model that defines the AI’s working principle, for example GPT or Gemini − that is already human-like. This has improved dramatically, and with the help of agentic AI, it is improving faster than we expected. Through fine-tuning and training on domain knowledge, it keeps getting more accurate; we no longer have the traditional problems of voice recognition. It primarily depends on whether, and to what extent, an OEM wants to use it. I suggest pursuing an ‘AI first’ approach to reap the full benefit, and being more aggressive about this.

What exactly will LLM-based agents do in automotive applications?

So far, voice recognition systems have been deterministic, meaning no randomness is involved. You test all the corner cases, that is, scenarios that rarely occur, then you say “passed”, and you have a fixed system. With an LLM, the possibilities are almost endless. The system understands what you want, and its capability is not limited by a fixed system. To enable this, you need to put in an API architecture − an application programming interface − something you did not have in traditional systems. The LLM can use that API to invoke more complex commands. For example, instead of just saying, “Wind down the window,” you might say, “Wind down that window by 20 percent.” And it also enables “fuzzy” commands, for instance: “I need it a little bit warmer.” The LLM has what we call a short-term memory. But in the future, it will be able to remember the way you express your commands or requests. So, if you say, “It’s a little bit hot for me,” the system may ask you to clarify, make a suggestion, and remember the outcome. Incidentally, the definition of an agent in a vehicle is different from the definition of an agent in, say, ChatGPT. ChatGPT is just a conversation; an agent will also complete a task. It will act as your personal assistant.
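
To illustrate the tool-calling pattern Mr. Wei describes, here is a minimal Python sketch: the LLM emits a structured call against a vehicle API instead of matching a fixed grammar. The vehicle functions, the tool descriptions, and the JSON the model is assumed to emit are all hypothetical illustrations, not any OEM’s or LLM vendor’s actual interface.

import json

# Hypothetical in-car control functions; names and signatures are illustrative.
class VehicleAPI:
    def set_window_position(self, window: str, percent_open: int) -> str:
        return f"{window} window set to {percent_open}% open"

    def adjust_cabin_temperature(self, delta_celsius: float) -> str:
        return f"cabin temperature changed by {delta_celsius:+.1f} degrees"

# Tool descriptions exposed to the LLM: the model answers a fuzzy request by
# choosing a tool and filling in its arguments.
TOOLS = {
    "set_window_position": "Open or close a window. Args: window (str), percent_open (0-100).",
    "adjust_cabin_temperature": "Make it warmer or cooler. Args: delta_celsius (float).",
}

def execute_tool_call(api: VehicleAPI, llm_output: str) -> str:
    """Parse the JSON tool call the LLM emitted and invoke the matching method."""
    call = json.loads(llm_output)
    if call["name"] not in TOOLS:
        raise ValueError(f"unknown tool: {call['name']}")
    return getattr(api, call["name"])(**call["arguments"])

if __name__ == "__main__":
    api = VehicleAPI()
    # What a model might emit for "Wind down that window by 20 percent".
    print(execute_tool_call(
        api, '{"name": "set_window_position", "arguments": {"window": "driver", "percent_open": 20}}'))
    # What it might emit for the fuzzy request "I need it a little bit warmer".
    print(execute_tool_call(
        api, '{"name": "adjust_cabin_temperature", "arguments": {"delta_celsius": 1.5}}'))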

How does that work in technical terms?

Firstly, an LLM is the foundation of agentic AI. To run it in a vehicle, accelerator technology is needed so the LLM can run on an edge device. Put simply, this means a specialized chip that can process LLM queries and answers. If you don’t have that AI infrastructure in the car, you can use the cloud for inferencing capability outside the vehicle. Since 2024, there has been a new definition of an AI PC that companies such as Intel or Qualcomm are pushing: a dedicated system-on-chip, or SoC, for AI tasks, a dedicated microprocessor. As an example, this is how Tesla handles things like object detection or classification with its local vision model. If you don’t have that in the car, you must do the AI inferencing via 5G and the cloud. In the future, dedicated, AI-capable hardware will be required on board. Some OEMs have presented ChatGPT systems in cars, for example Mercedes at CES 2024. But that is not edge computing, because they transfer data to the cloud for AI processing.
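
To make the deployment options concrete, the following Python sketch shows the decision Mr. Wei outlines: run inference on an on-board accelerator when the vehicle has one, otherwise fall back to cloud inferencing over the mobile network. The class and function names are illustrative assumptions, not any vendor’s software stack.

from dataclasses import dataclass

@dataclass
class VehicleCompute:
    has_ai_accelerator: bool  # dedicated AI SoC / NPU able to run the LLM locally
    has_connectivity: bool    # 5G link to a cloud inference service

def run_inference(query: str, compute: VehicleCompute) -> str:
    # Prefer on-board (edge) inference; fall back to the cloud only if the
    # car lacks dedicated AI hardware but still has a network connection.
    if compute.has_ai_accelerator:
        return f"on-board accelerator answers: {query!r}"
    if compute.has_connectivity:
        return f"query sent to the cloud over 5G: {query!r}"
    return "no local accelerator and no network: request cannot be served"

if __name__ == "__main__":
    print(run_inference("Is the object ahead a pedestrian?", VehicleCompute(True, True)))
    print(run_inference("Summarize my last trip", VehicleCompute(False, True)))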

Why not use the cloud all the time?

There are two critical aspects here. The first is privacy. Every time you call the cloud via your Apple iPhone, for example, their privacy rules will apply, and they may use your data. That can be prevented with in-car edge computing. Usually, safety, security, and privacy aspects are essential for car makers and owners. The second aspect is that sometimes the network − and thus the computation − simply does not work. But your system must always be working, especially where safety aspects are concerned. And you know, a car is so personal − it knows where you are and what you’re doing … and hackers may want to get hold of that data as more cars become connected. That is much more difficult when the processing is isolated within the car. Eventually, we will probably use a hybrid system for deploying AI inference. Depending on the complexity of a question, the system either asks the cloud or relies on the onboard LLM. Sometimes the system needs a bigger ‘brain’ to solve a question, but usually the local LLM will be sufficient. That is my vision of an in-car AI infrastructure.
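
The hybrid approach could look roughly like the following Python sketch, where a simple heuristic decides between the onboard LLM and the cloud. The complexity check and both model back ends are placeholders introduced for illustration; a production system would use a proper classifier and enforce connectivity and privacy policies.

def onboard_llm(prompt: str) -> str:
    # Placeholder for the local model running on the in-car accelerator.
    return f"[edge model] {prompt}"

def cloud_llm(prompt: str) -> str:
    # Placeholder for a larger frontier model reached over the network.
    return f"[cloud model] {prompt}"

def route_query(prompt: str, network_up: bool, contains_private_data: bool) -> str:
    # Keep private data local, stay local when offline, and escalate to the
    # cloud only when the request looks too complex for the onboard model.
    looks_complex = len(prompt.split()) > 15  # crude stand-in for a complexity classifier
    if contains_private_data or not network_up or not looks_complex:
        return onboard_llm(prompt)
    return cloud_llm(prompt)

if __name__ == "__main__":
    print(route_query("Make it a little warmer", network_up=True, contains_private_data=False))
    print(route_query("Plan a three-day coastal road trip with charging stops, "
                      "kid-friendly hotels and restaurants that can seat six people",
                      network_up=True, contains_private_data=False))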

Finally, what is your company’s role in automotive AI development, and how will hardware technology develop?

I joined Skymizer because they are experts in the compiler field, and they know chips − they are not AI experts per se; they don’t compete in training models. I define the products and business models for automotive AI applications. We know that chip design for AI relies heavily on compiler technology. These are application-specific chips, not general-purpose chips, and the customization is mainly about acceleration. We have lived in the age of general-purpose CPUs for a long time, but that is not efficient. Nvidia GPUs, for example, have been used for some time because they accelerate many AI-specific functions quite well. However, a GPU is still relatively general-purpose when it comes to AI. Suppose you want to narrow down to specific AI tasks. In that case, you have what we call convolutional or transformer-based workloads, which require more specific instructions for optimal efficiency. And that is what Skymizer delivers − the most efficient and affordable software and hardware co-designed solutions for AI SoCs. Compared with a GPU, this can reduce costs by a factor of ten and boost performance at the same time.

Interview: Gernot Goppelt