
Case Studies

Find out how we've helped our clients and powered digital transformation.


Read our case studies, white papers, articles and more.

Machine Learning & AI / Case Study

Daemon builds an AI-powered historical chat engine

Know-me is a history-tech startup pioneering a new way to tell historical and family stories. They bring historical figures and Holocaust survivors to life as digital personae that users can interact with directly. Through video chat and text messaging, users can hold candid conversations with a reconstructed person about their life and history, making history accessible in a way that has never been done before.

In the first instance, Know-me focuses on Holocaust survivors and historical figures. Later on the roadmap, users will be able to interact with people from their own family history.


The challenge

Know-me came to Daemon with a preliminary tech demo, intending that we work collaboratively to improve and refine the AI behind it. A key requirement was to improve the latency of the system, which was too high for a satisfying user experience, and the responses from the historical figures, which were not true to their personality. Safety was also a key concern: the personae need to be interesting and engaging without being inaccurate or inappropriate, which requires a systematic and iterative approach to development.

Know-me were keen to get an improved working demo deployed to present to the investment community. 


Our approach

Daemon started by implementing Llama 2 with Text Generation Inference (TGI) on Amazon SageMaker Real-time Endpoints, a change that we knew would immediately improve latency, alongside changes to the back-end orchestration of the chat model.
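As a rough illustration of how a back end talks to a TGI container behind a SageMaker real-time endpoint, the sketch below builds a request body in TGI's `generate` format (`inputs` plus a `parameters` object) and shows, commented out, the corresponding `invoke_endpoint` call. The endpoint name is hypothetical; this is not Daemon's actual code.

```python
import json

def build_tgi_request(prompt: str, max_new_tokens: int = 256,
                      temperature: float = 0.7) -> bytes:
    """Build the JSON body expected by a Text Generation Inference container."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "do_sample": True,
        },
    }
    return json.dumps(payload).encode("utf-8")

# Calling the endpoint requires AWS credentials, so it is shown commented out:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName="llama2-tgi-endpoint",  # hypothetical endpoint name
#     ContentType="application/json",
#     Body=build_tgi_request("Tell me about your childhood."),
# )
# generated = json.loads(response["Body"].read())[0]["generated_text"]
```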

After initial load and latency testing, Daemon found that latency had improved by a factor of approximately three. However, the best Llama 2 model, Llama-2-70b, necessary for the best quality of answers, still had a latency on the order of 8 seconds even on the best instance type (p4d.24xlarge with 8 A100 GPUs). To address this, Daemon introduced the Mixtral 8x7b LLM, which vastly improved latency while maintaining the best quality available from open-source models. Later, Daemon adopted the Nous Research Hermes derivative of Mixtral to take advantage of its improved guidability, such as support for a dedicated system prompt.
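Latency testing of the kind described above can be as simple as timing repeated calls and reporting percentiles. The harness below is a generic sketch (not the tooling the team actually used): it takes any callable standing in for an endpoint invocation and returns median and p95 latency.

```python
import statistics
import time

def benchmark_latency(call, runs: int = 20) -> dict:
    """Time repeated invocations of `call` and report median and p95 latency
    in seconds. `call` stands in for one round-trip to the model endpoint."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "median": statistics.median(samples),
        # nearest-rank p95 over the sorted samples
        "p95": samples[int(0.95 * (len(samples) - 1))],
    }
```

In practice the callable would wrap the actual endpoint request, and the run count would be high enough to smooth over cold starts and network jitter.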

Daemon designed and implemented a back-end orchestration and templating system offering control over the model through a simplified UI. This system, accompanied by an automated qualitative evaluation system, allowed Know-me to develop new scenarios and personae with minimal training while producing strong results in safety and character personality. With latency a key concern, the system was designed to keep the number of calls to the LLM to a minimum.
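To make the templating idea concrete, here is a minimal sketch of how a persona definition, chat history and new user message could be rendered into a single prompt, so that each conversational turn costs exactly one LLM call. The tag format is illustrative only (the Hermes models use their own chat template), and the field names are assumptions, not Know-me's actual schema.

```python
def render_persona_prompt(persona: dict, history: list, user_message: str) -> str:
    """Assemble system instructions, prior turns and the new message into ONE
    prompt string, so the model is invoked once per turn."""
    lines = [
        # Persona and safety guidance live in a single system block.
        f"<system>You are {persona['name']}. {persona['background']} "
        "Stay in character; decline topics outside your lived experience.</system>"
    ]
    for role, text in history:  # history: list of (role, text) pairs
        lines.append(f"<{role}>{text}</{role}>")
    lines.append(f"<user>{user_message}</user>")
    lines.append("<assistant>")  # cue the model to answer in character
    return "\n".join(lines)
```

A non-technical editor can then tune a persona by editing only the `name` and `background` fields through the UI, without touching the prompt-assembly logic.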

D-ID provided AI-generated video for the demo, the WebSpeech API performed voice recognition, and 11Labs provided personalisable text-to-speech.


Architecture of the demo in AWS

Lastly, Daemon built a web app with React for use across desktop and mobile screens. The app provides an interface that emulates video calling and chat messaging for site users. Daemon adhered to best practices in designing the page, including security best practices.

Along the way, Daemon introduced various best practices, including deploying the back end on ECS orchestration, Terraform infrastructure as code, logging, separation of configuration and code, automated security checks, and the framework for introducing a proper CI/CD system.


The outcome

Daemon was able to:

  • Realise the client’s vision: a fully functional demo that brings historical figures to life. This can, in turn, fulfil a social service by educating users about history in a way that was never done before.  
  • Enable users to engage in audio conversations with animated figures from photos, improving user experience.
  • Address technical challenges, including reducing system latency and enhancing response specificity by incorporating AI best practices.
  • Successfully migrate the solution to the cloud, ensuring scalability and accessibility.
  • Implement user interface and chat functionality, enhancing overall usability and accessibility for Know-me’s target audience.
  • Provide guidance and implementation around best practices so that Know-me is close to a production-ready product.

As a result, Know-me was able to negotiate support from partners and suppliers and sell their demo across funding and developer channels.



Daemon worked tirelessly, closely aligning with our needs to enhance our LLM's response quality and sensitivity significantly, while also reducing latency to manageable levels. They delivered this through a user-friendly interface, effectively bringing our concept to life for presentation to potential leads. I highly recommend Daemon.

Joshua Balla-Muir, CTO


If you’d like to know more about how we do things at Daemon, get in touch.