Microsoft has unveiled Fara-7B, its first 7B-parameter model for computer-use tasks. Built on Qwen 2.5-VL (7B), it supports a 128k context length and was trained for 2.5 days on 64 H100 GPUs. The model takes screenshots and text as input and predicts a chain of thought followed by an action. It can handle tasks such as making reservations and planning travel. As a safety measure, it pauses at critical steps such as entering personal information. Fara-7B was released under the MIT license on November 24, 2025. Crypto news outlets are watching for potential integration into DeFi and blockchain tools.

AIMPACT News, May 16 (UTC+8): Microsoft has launched Fara-7B, its first 7B-parameter agentic small language model designed specifically for computer-use scenarios. Built on a multimodal decoder architecture, the model takes screenshot images and textual context as input and directly predicts a chain of thought followed by an action with its parameters. It is based on Qwen 2.5-VL (7B), supports a 128k context length, and was trained for 2.5 days on 64 H100 GPUs. It was released under the MIT license on November 24, 2025. Fara-7B perceives the browser through screenshots and, drawing on internal reasoning and a tracked history of prior states, predicts the next action and its parameters (e.g., click coordinates). It was trained on a large-scale, fully synthetic dataset. The model can plan and execute high-level tasks such as booking a restaurant, applying for a job, or planning a trip. For safety alignment, it applies robust post-training methods that include critical-point recognition, refusing seven categories of policy-violating tasks and pausing at critical points such as entering personal information or completing a purchase. Users can deploy and interact with the model via the GitHub repository, vLLM, and the fara-cli tool, primarily to automate web-based tasks. (Source: InfoQ)
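The loop described above (screenshot in, thought and action out, pause at critical points) can be illustrated with a minimal Python sketch. This is not Fara-7B's code or the fara-cli interface: `Step`, `AgentState`, `run_agent`, and `scripted_policy` are hypothetical names, and a hard-coded script stands in for the model so that the control flow runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One model prediction: a thought plus an action and its parameters."""
    thought: str
    action: str
    args: dict = field(default_factory=dict)
    critical: bool = False  # hypothetical flag marking a critical point


@dataclass
class AgentState:
    """Task description plus the history of steps executed so far."""
    task: str
    history: List[Step] = field(default_factory=list)


def run_agent(
    task: str,
    policy: Callable[[bytes, AgentState], Step],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Step], None],
    confirm: Callable[[Step], bool],
    max_steps: int = 10,
) -> Tuple[str, AgentState]:
    """Observe, predict, and act until done, paused, or out of steps."""
    state = AgentState(task)
    for _ in range(max_steps):
        screenshot = take_screenshot()      # observe the browser
        step = policy(screenshot, state)    # predict thought + action
        if step.action == "terminate":
            return "done", state
        if step.critical and not confirm(step):
            return "paused", state          # hand control back to the user
        execute(step)                       # perform the action
        state.history.append(step)          # track state for the next step
    return "max_steps", state


def scripted_policy(screenshot: bytes, state: AgentState) -> Step:
    """Stand-in for the model: replays a fixed script (an assumption)."""
    script = [
        Step("Open the booking form.", "click", {"x": 120, "y": 340}),
        Step("Enter party size and time.", "type", {"text": "2 people, 7pm"}),
        Step("Submitting finalizes the booking.", "click",
             {"x": 400, "y": 620}, critical=True),
        Step("Task finished.", "terminate"),
    ]
    return script[min(len(state.history), len(script) - 1)]
```

In this sketch, a `confirm` callback that returns `False` leaves the agent in the `"paused"` state with only the first two actions executed, which mirrors the behavior of stopping before personal information is entered or a purchase is completed.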