Gemamba 2B (Base)
Video Large Language Models
The first model in the world to combine a Mamba-based video encoder with an LLM. It's also the smallest model in our lineup.
Works perfectly on edge devices with limited compute.
Cheap and fast fine-tuning on videos.
Near real-time generation.
LaserGaze
Video Processing Tools
An open-source tool for real-time gaze estimation in video. It uses temporal data to improve accuracy when tracking eye positions and calculating gaze vectors, making it suitable for AR, behavioral analysis, and user interface control.
Retrieval-Framework
RAG
A tool that converts scientific PDFs into plain text for your LLM-related needs, such as building RAGs or agents for academic knowledge. It was developed in collaboration with the LlamaIndex team.
VideoFineTune
Training Framework
Soon on GitHub!
The fine-tuning module helps developers turn a set of videos into a high-quality training dataset and then fine-tune any available TensorSense video LLM on that data.
VideoEmbeddings-Base
Embedding Model
Soon on GitHub!
This embedding model turns videos into semantic vectors. You can use it to add video retrieval to your RAG pipeline.
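As an illustration of how such embeddings plug into a RAG pipeline, here is a minimal sketch of similarity search over video vectors. The vectors, file names, and cosine-similarity retrieval step below are all hypothetical placeholders, not the model's actual API:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional embeddings; a real pipeline would get
# these from the VideoEmbeddings-Base model, not hard-code them.
video_index = {
    "cooking_demo.mp4": [0.9, 0.1, 0.0, 0.2],
    "soccer_match.mp4": [0.1, 0.8, 0.3, 0.0],
}
query_vec = [0.85, 0.15, 0.05, 0.1]  # e.g. embedding of a text query

# Retrieve the video whose embedding is closest to the query.
best = max(video_index,
           key=lambda name: cosine_similarity(query_vec, video_index[name]))
```

In a full RAG setup, the retrieved video (or a caption of it) would then be passed to the LLM as context.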
Object Detection
Video Large Language Models
Soon on GitHub!
A fine-tuned version of MambaLlama 8B, trained to locate objects from a text description or a visual reference and to report how their positions change in a dynamic environment.
Gemamba: Can Mamba Beat a Transformer In a Multimodal LLM? [Updated]
Andrey Buzin
May 3, 2024
“The Tortoise Lays on Its Back, but You are Not Helping” - Navigating the Complexities of Emotional Data in AI
Nick Sheero
Feb 7, 2024
"What We Do in the Pixels" - TensorSense Research Card
Mark Ayzenshtadt
Feb 7, 2024