AI agents — Papers

2 papers found

Task-Based Engineering

Dmitry I. Sviridenko

Mathematics & AI · Vol. 1, No. 2 · Jul 2026

The article describes the main methodological principles of a new scientific and engineering discipline, task-based engineering, which represents a synergistic integration of the task-based approach, engineering, and agent-based AI. The features of each component of task-based engineering are examined, including their advantages and disadvantages. A conceptual model of the new discipline is presented, and its key principles, methodological framework, and application potential are elucidated. Differences from related fields are highlighted, and the challenges and development prospects are outlined.

task-based engineering, weakly structured need, task-based approach, domain semantic model, solution criterion, contextual conditions, task formalization, semantic modelling, engineering, prompt engineering, context engineering, agent-based AI, AI agents, multi-agent systems

DOI: 10.66693/mathai.1038

Detecting Hallucinations In LLM Responses Using Token-level Log-probability Signals

Vadim Eliseev, Aleksandra Yurievna Maksimova

Mathematics & AI · Vol. 1, No. 2 · May 2026

Large language models (LLMs) have proven to be powerful tools for many natural language tasks, from serving as high-quality text classifiers to acting as agents in complex retrieval-augmented generation (RAG) systems. However, from the very beginning they have suffered from a major limitation: hallucinations, i.e. confidently generated incorrect or misleading information that may still be loosely related to the given task. This issue is critical in error-sensitive domains such as finance, medicine, and law, where even small inaccuracies can cause significant harm. In this study we address the early detection of hallucinated answers based on the user input (prompt), the LLM's answer, and, more importantly, token-level probability signals that can be extracted from the LLM at inference time. We constructed a dataset that combines textual information with sequences of token log-probabilities and their statistics (mean, min, variance, percentiles, etc.), and labeled each answer as a hallucination or not. We trained a lightweight classifier that outputs the probability that a given response is a hallucination. We evaluate the classifier and perform ablation studies to quantify the contribution of token-level signals versus text-only features. The trained model is intended to serve as a standalone output guard agent in a multi-agent system: it rejects the LLM generator's answer if its hallucination probability is above the acceptance threshold, and it protects users from incorrect or misleading answers by making the whole system either regenerate the answer or confirm that it cannot give a faithful reply.

LLM, text classification, RAG, NLP, dataset construction, AI agents, machine learning

DOI: 10.66693/mathai.1025
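
The feature and guard steps described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `logprob_features` and `guard`, the nearest-rank percentile rule, the chosen percentiles, and the default threshold of 0.5 are all assumptions made for illustration. The trained classifier is represented only by a hypothetical `score` callable that maps features to a hallucination probability.

```python
import math
import statistics
from typing import Callable, Dict, Sequence


def logprob_features(logprobs: Sequence[float]) -> Dict[str, float]:
    """Summarize token log-probabilities into scalar features (illustrative set)."""
    if not logprobs:
        raise ValueError("need at least one token log-probability")
    ordered = sorted(logprobs)

    def percentile(q: float) -> float:
        # Nearest-rank percentile on the sorted values, q in (0, 100].
        rank = max(1, math.ceil(q / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "min": ordered[0],
        "variance": statistics.pvariance(ordered),
        "p10": percentile(10),
        "p50": percentile(50),
    }


def guard(
    features: Dict[str, float],
    score: Callable[[Dict[str, float]], float],
    threshold: float = 0.5,  # assumed value; the abstract does not state one
) -> str:
    """Reject the answer when its hallucination probability exceeds the threshold."""
    return "reject" if score(features) > threshold else "accept"


if __name__ == "__main__":
    feats = logprob_features([-0.1, -0.2, -3.0, -0.05])
    # Hypothetical stand-in for the trained classifier: flags a very unlikely token.
    toy_score = lambda f: 0.9 if f["min"] < -2.0 else 0.1
    print(feats)
    print(guard(feats, toy_score))
```

In a multi-agent setup, a "reject" result would be the signal for the system to regenerate the answer or report that it cannot reply faithfully; how that retry is orchestrated is outside this sketch.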
