Published Papers

27 papers found
Estimating Importance of Highly Correlated Features Using Matrix Factorization
Mathematics & AI · May 2026
Hyperspectral images contain a large volume of source data that exhibits high correlations along neighboring spectral bands. This makes it necessary to select the most informative features among correlated groups of features to effectively solve various machine learning problems. A method of feature importance evaluation for hyperspectral image data is proposed. This method combines iterative training of Decision Tree classifiers based on spectral features with matrix factorization to overcome sparsity. Decision trees provide an intrinsic feature selection mechanism, but only a small number of features are usually taken into account by the CART algorithm when training a single decision tree classifier instance. Furthermore, when features are highly correlated (e.g., Pearson $\rho > 0.8$), tree-based methods like Random Forest or XGBoost arbitrarily assign importance to one feature while suppressing others, because the correlated features redundantly capture the same signal. To overcome this problem, an additional balancing term was incorporated into the optimization function used to obtain the matrix factorization. The considered method of feature importance evaluation is compared with model-specific tree-based methods such as the vanilla Gini impurity decrease and the more complicated Boruta algorithm. Classification accuracy is tested using a Random Forest classifier on significant features. Selecting features with higher importance scores yields models with greater training accuracy.
Detecting Hallucinations In LLM Responses Using Token-level Log-probability Signals
Mathematics & AI · May 2026
Large language models (LLMs) have proven themselves to be powerful tools for many natural language tasks, from serving as high-quality text classifiers to acting as agents in complex retrieval-augmented generation (RAG) systems. However, from the very beginning they have suffered from a major limitation: hallucinations, i.e. confidently generating incorrect or misleading information that may also be only weakly related to the given task. This issue is critical in error-sensitive domains such as finance, medicine, and law, where even small inaccuracies can cause significant harm. In this study we address the early detection of hallucinated answers based on the user input (prompt), the LLM's answer, and, most importantly, token-level probability signals that can be extracted from the LLM at inference time. We constructed a dataset that combines textual information with sequences of token log-probabilities and their statistics (mean, min, variance, percentiles, etc.) and labeled each answer as a hallucination or not. We trained a lightweight classifier that outputs the probability that a given response is a hallucination. We evaluate the classifier and perform ablation studies to quantify the contribution of token-level signals versus text-only features. The trained model is intended to serve as a standalone output guard agent in a multi-agent system: it rejects the answer of the LLM generator if its hallucination probability is above an acceptance threshold, protecting users from incorrect or misleading answers by making the whole system regenerate the answer or state that it cannot give a faithful reply.
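The statistics listed in the abstract above (mean, min, variance, percentiles) can be illustrated with a minimal sketch. The function name `logprob_features`, the chosen percentiles (10, 50, 90), and the use of population variance are assumptions made for illustration; the abstract does not specify the exact feature set.

```python
import math
import statistics


def logprob_features(token_logprobs):
    """Summarise a sequence of token log-probabilities as a flat feature dict.

    Hypothetical helper: the percentile levels and the variance definition
    are illustrative choices, not taken from the paper.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    ordered = sorted(token_logprobs)
    n = len(ordered)

    def percentile(q):
        # Linear interpolation between the two closest ranks, q in [0, 100].
        pos = (n - 1) * q / 100.0
        lo = math.floor(pos)
        hi = math.ceil(pos)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    return {
        "mean": statistics.fmean(ordered),
        "min": ordered[0],
        "variance": statistics.pvariance(ordered),
        "p10": percentile(10),
        "p50": percentile(50),
        "p90": percentile(90),
    }


# One very unlikely token (-3.0) pulls the minimum and the mean down.
features = logprob_features([-0.1, -0.2, -3.0, -0.05])
```

A feature dict of this kind could be concatenated with text features before being passed to any lightweight classifier.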
Operator Learning for High-Dimensional Symplastic Growth Dynamics with Stochastic Cell Division
Mathematics & AI · May 2026
We study operator learning for a nonlinear dynamical system describing symplastic plant leaf growth with multiple interacting cell files and stochastic cell division. The biomechanical model consists of coupled ordinary differential equations governing visible cell lengths, relaxed wall lengths, isosmotic lengths, and shared wall fragments. For $N$ cell files with $M$ cells per file, the state dimension scales as $D(N,M) \sim 3NM + K(N,M)$, where $K$ denotes the number of shared fragments. While the number of state variables grows linearly in $N$, fragment-based mechanical coupling induces a rapidly increasing interaction structure, leading to dense Jacobians and growing computational cost of numerical integration. In the multi-file regime, repeated simulation becomes computationally prohibitive for parameter exploration and inverse calibration. We formalize the simulator as a nonlinear operator $\mathcal{F} : \Theta \subset \mathbb{R}^p \to \mathbb{R}^B$ mapping mechanical parameters to the longitudinal cell length profile. We train multilayer perceptron (MLP) surrogates to approximate $\mathcal{F}$ using simulator-generated data. The learned surrogate replaces repeated ODE integration and enables fast prediction of spatial growth profiles. We evaluate generalization performance on held-out parameter configurations and demonstrate efficient parameter calibration to experimental profiles. We further analyze structural properties of the parameter-to-profile map, including local regularity and an intrinsic stochastic noise floor induced by random cell division. Our results show that neural operator approximation provides a scalable framework for accelerating analysis and inverse modeling of coupled high-dimensional biological growth dynamics.
Mathematics of Natural Intelligence
Mathematics & AI · May 2026
In the process of evolution, the brain has achieved a degree of perfection that artificial intelligence systems lack and that requires its own mathematics. The concept of the cognitome, introduced by Academician K.V. Anokhin as the cognitive structure of the mind (a high-order structure of the brain and a neural hypernetwork), is taken as the basis for modeling. Consciousness is then a special form of dynamics in this hypernetwork: a large-scale integration of its cognitive elements. The cognitome, in turn, consists of interconnected COGs (cognitive groups of neurons) of two types: functional systems and cellular ensembles. K.V. Anokhin sees the task of the fundamental theory of the brain and mind as describing these structures, their origin, their functions, and the processes within them. The paper presents mathematical models of these structures based on new mathematical results, as well as models of different cognitive processes in terms of these models. In addition, it is shown that these models can be derived from a fairly general principle of how the brain works: the brain discovers all possible causal relationships in the external world and draws all possible conclusions from them. Based on these results, the paper presents models of: “natural” classification; the theory of functional brain systems by P.K. Anokhin; the prototype theory of categorization by E. Rosch; the theory of causal models by Bob Rehder; and the theory of consciousness as integrated information by G. Tononi.
The Evolution of Mind: Emergence of Collective Intelligence through Logical-Probabilistic Knowledge Dynamics in Multi-Agent ENIGMA Metaverse Ecosystems
Mathematics & AI · May 2026
Modern metaverse platforms, populated by heterogeneous multi-agent systems (MAS), generate vast streams of experiential data whose epistemic value remains largely untapped. This paper introduces the Enigma framework — a formal theory of collective intelligence emergence in metaverse ecosystems, grounded in a novel logical-probabilistic learning theory that extends classical first-order logic with probabilistic confidence annotations and distributed knowledge semantics. We define a Distributed Knowledge Lattice (DKL) over multi-agent interactions and prove that, under precisely stated monotonicity and convergence conditions, the collective knowledge of an agent population forms a complete lattice whose least upper bound represents an emergent cognitive state unreachable by any individual agent. We formalize the dynamics of knowledge creation, verification, and propagation through a system of Logical-Probabilistic Agents (LP-agents) that interact with LLM-driven entities, providing trustworthy and explainable reasoning via symbolic proof chains stored on a multi-blockchain ledger. Central results include: (i) a Collective Intelligence Convergence Theorem establishing conditions under which the system's aggregate knowledge monotonically approaches a fixed point; (ii) a Probabilistic Inference Soundness Theorem guaranteeing that confidence propagation through distributed reasoning chains preserves logical consistency; and (iii) a polynomial-time algorithm for optimal knowledge retrieval from the distributed lattice. The framework is instantiated within the Enigma Metaverse architecture, where smart contracts govern knowledge tokenization, cross-chain knowledge interoperability protocols enable seamless sharing, and decentralized governance mechanisms ensure epistemic accountability. 
We demonstrate that this synthesis of mathematical logic, probability theory, LLM-based hypothesis generation, and blockchain-secured knowledge persistence provides a rigorous foundation for building self-optimizing, trustworthy, and explainable collective intelligence (CI) in virtual worlds.
AI-Based Detection of Unwanted Behavior: The Paradoxical Effect of Standard Data Augmentation in Video Surveillance
Mathematics & AI · May 2026
Public spaces and commercial environments face persistent challenges regarding human misconduct. Traditional surveillance remains passive, while manual monitoring is labor-intensive and inefficient. Consequently, automating the detection of unwanted behavior via digital assistants is essential. This study explores the application of deep learning models to identify such behavior and analyzes the impact of data augmentation on model performance. We utilized the UCF-Crime dataset (1,900 videos, 13 classes) and constructed a novel custom dataset comprising 5,236 videos with a focus on "violent behavior." Algorithms based on ResNet3D and DenseNet-RNN were trained on both original and augmented versions of these datasets. On the UCF-Crime dataset, the DenseNet-RNN model showed stability only in detecting the "arson" class. However, the model's stability and recognition quality improved significantly when trained on the custom dataset. Crucially, our experiments demonstrated that data augmentation negatively impacted the recognition quality on the custom dataset ($F(1,36)=7.40, p=0.01$). Specifically, the ResNet3D method exhibited a significant degradation in performance, with the AUC dropping from 0.83 (95% CI: $[0.80-0.86]$) before augmentation to 0.73 (95% CI: $[0.69-0.77]$) post-augmentation. The DenseNet-RNN method showed a similar downward trend (AUC 0.82, 95% CI: $[0.79-0.85]$ vs. AUC 0.75, 95% CI: $[0.71-0.79]$). These findings suggest that the blind application of standard augmentation procedures (a common industry practice for both static images and video frames) can lead to a paradoxical decline in detection quality. This highlights the urgent need to critically reevaluate how augmentations are selected and to develop task-specific methodologies that ensure transformations actually improve, rather than hinder, model efficacy in behavioral recognition.
Explainable AI for Mathematics: Proofs as Code with Knowledge Graph and Domain Ontology Support
Mathematics & AI · May 2026
We investigate whether structured knowledge retrieval from a mathematical library's dependency graph can improve neural theorem proving at inference time while maintaining explainability of the retrieved context. Through a controlled ablation study on the miniF2F-Test benchmark (197 tasks, 8 non-chain-of-thought modes, $K{=}8$ attempts per mode), we find that graph-based retrieval augmentation significantly improves proof generation on hard problems (those where the base model's parametric knowledge is insufficient) while having no measurable effect on problems the model already solves reliably. On 109 hard tasks, the best graph-augmented mode nearly triples the baseline success rate (pass@1: $3.56\% \to 10.42\%$, $+6.8$ percentage points, $p{=}0.001$, Wilcoxon signed-rank test). Deterministic pattern-based graph entry points outperform LLM-generated ones ($11\times$ faster, higher accuracy). The retrieved graph context is fully traceable: each hint maps to a specific edge in the Mathlib dependency graph, enabling the user to verify why particular lemmas were suggested. The approach is training-free, composable with any base prover, and adds negligible computational overhead.
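The abstract above reports pass@1 from $K{=}8$ attempts per mode but does not state how the metric is computed. A common choice, assumed here, is the standard unbiased pass@k estimator $1 - \binom{n-c}{k} / \binom{n}{k}$ averaged over tasks; the function names and the example counts below are hypothetical.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one task: n attempts, c of them successful."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so every k-subset contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(success_counts, n, k):
    """Average the per-task estimates over a benchmark."""
    return sum(pass_at_k(n, c, k) for c in success_counts) / len(success_counts)


# Made-up counts for four tasks with 8 attempts each (not data from the paper).
score = mean_pass_at_k([0, 1, 0, 8], n=8, k=1)
```

For $k = 1$ the estimator reduces to the fraction of successful attempts, $c / n$, per task.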
Hybrid UAV Hazard Detection Approach Based on Open-Vocabulary Detection and MIVAR Expert System
Mathematics & AI · May 2026
The integration of neural network methods in computer vision with logical inference based on a Mivar expert system allows leveraging the advantages of both paradigms: high efficiency in processing unstructured visual data and the interpretability of decisions made based on formalized rules. An analysis of various computer vision tasks was conducted, demonstrating that OVD (Open-Vocabulary Detection) is the preferred tool for dynamic rescue operation scenarios. OVD provides the best balance between the flexibility of detecting arbitrary categories, reliability when working with multiple objects, and the availability of data for training. Modern vision-language architectures, their features, and advantages were investigated. YOLO-World was selected as the base model, as it best meets the stringent requirements of real-time operation, achieving high processing speed while maintaining the flexibility of an open vocabulary. A fine-tuning procedure for the model was carried out, which included freezing the text encoder and the early layers of the convolutional backbone, as well as combining the Flickr30k, VisDrone, and SARD2 datasets. Using Flickr30k helped preserve the quality of the vision-language space, while the specialized datasets adapted the model to real-world application conditions. The fine-tuned model showed a significant increase in accuracy (mAP@50 rose from 0.0974 to 0.342, and mAP@50:95 from 0.0673 to 0.202), and also gained the ability to correctly recognize human poses specific to rescue operations and types of vehicles in drone imagery. A system of parameters and rules for the Mivar Expert System (MES) is proposed, which allows transitioning from simply listing objects in an image to a comprehensive situation assessment. This transforms the system into an active operator assistant, capable not only of detecting but also of interpreting threats.
Thus, the developed hybrid intelligent system, combining the YOLO-World detector and the Mivar expert system, fully meets the stated goal and specified requirements: real-time operation due to the optimized architecture; flexible control through text prompts in natural language; and interpretability and logical validity of decisions thanks to Mivar logical inference.
Random forest regression and Shapley additive explanation for effective dose rate estimation in high-energy neutron fields based on Bonner spectrometer measurements
Mathematics & AI · May 2026
The article proposes a method for assessing the neutron energy spectrum and effective dose rate of personnel based on the readings of a Bonner spectrometer (BSS) for high-energy neutron fields. Neutron flux density can be obtained from BSS measurements by solving the system of Fredholm integral equations of the first kind. In our paper the spectra were unfolded using the supervised machine learning algorithm "random forest" with optimization of the model hyperparameters. The model was trained and tested on a database of 251 spectra for various power facilities (80% of the data was used for training the model, while 20% was used for testing it). The input features of the model were the spectrometer readings for BSS moderator spheres and the categorical feature "spectrum type" describing the facility and conditions under which the spectrum was obtained. The output parameters of the model were the spectrum description in the form of a histogram for 60 energy values, as well as the dose rate calculated from the spectrum for the corresponding conversion factor. Since the dataset of real spectra is small, a synthetic database of 104 entries generated using the Frascati Unfolding Interactive Tool method was developed. A second model was trained on this synthetic dataset and compared with the first one. The effect of the error in the initial data on the spectrum and the dose rate obtained from it was estimated by the Monte Carlo method using random samples. The test dataset showed that the unfolded spectra are close in nature to the original ones and have a high correlation with them. The paper proposes a method for selecting the optimal number of moderator spheres based on the explainable artificial intelligence method "Shapley additive explanation" (SHAP). The SHAP method was used to demonstrate the degree of influence of measurements with moderator spheres of different diameters on the spectrum prediction.
It was shown that the resulting spectrum is most influenced by measurements with moderator spheres of 10" and 12". Optimization of the choice of spheres leads to a decrease in the personnel doses during measurements. The model was trained and calculations were performed on the JINR Multifunctional Information and Computing Complex.
Application of blurry models for semantic modelling of object domains
Mathematics & AI · May 2026
Semantic modelling plays an important role in data processing, enabling a deep understanding of information and the development of intelligent systems. One of the methods is a four-level model of knowledge representation including ontological, theoretical, empirical and statistical levels. The problem of incomplete knowledge makes it difficult to describe axioms in an object domain. The paper discusses an approach in which a precedent model (third level) is created based on precedent knowledge and then, through its fuzzification, statistical knowledge (fourth level) is obtained. This probabilistic knowledge is objective. However, in some domains subjective expert estimates may also be used. In such cases, the process starts with the creation of a blurry (fuzzy) model. The paper proposes a mathematical apparatus for reconstructing a set of precedents based on these estimates and describes the properties of blurry models.
Implementation of a Cryptographic Hash Function Based on a Deep Neural Network
Mathematics & AI · May 2026
We present a two-layer construction for image hashing. First, a perceptual binary code $c(x)$ is derived from a ResNet-18 embedding (after global average pooling, $d=512$) via a linear projection and sign quantization; optionally, a real-valued serialization of length $n=8ds$ bits is used. The code $c(x)$ enables fast approximate nearest-neighbor search: we empirically measure robustness to permissible transforms (low intra-BER), separability of unrelated pairs (inter distances near $n/2$), bit balance and weak inter-bit correlations, and we estimate a lower bound on the source min-entropy. Second, $c(x)$ serves as a noisy source for a fuzzy extractor producing a reproducible secret $R$ and public data $P$; a cryptographic tag $T$ is then derived via KDF and HMAC/SHA-3. This preserves similarity search over $c(x)$ while assigning cryptographic guarantees (preimage/second-preimage/collision) to $T$, which reduce to the security of the underlying primitives given sufficient post-publication min-entropy $H_\infty(C \mid P)$. We discuss limitations of perceptual hashes (adversarial examples) and parameter selection ($n$, error-correction radius $t$, secret length $|R|$) driven by measured BER distributions and min-entropy estimates.
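The first layer described above (linear projection followed by sign quantization, then bit error rate between codes) can be sketched as follows. This is a sketch under stated assumptions: random Gaussian vectors stand in for ResNet-18 embeddings, a random Gaussian matrix stands in for the projection, and the code length of 64 bits and noise level 0.05 are arbitrary illustrative values not taken from the paper.

```python
import random


def binary_code(embedding, projection):
    """Project the embedding onto each row of `projection` and keep the sign bit."""
    return [
        1 if sum(w * x for w, x in zip(row, embedding)) >= 0 else 0
        for row in projection
    ]


def bit_error_rate(code_a, code_b):
    """Fraction of positions where two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


rng = random.Random(0)
d, n_bits = 512, 64  # d matches the pooled embedding size; n_bits is assumed

# Stand-ins: a random projection and synthetic "embeddings".
projection = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_bits)]
x = [rng.gauss(0, 1) for _ in range(d)]
x_perturbed = [v + rng.gauss(0, 0.05) for v in x]  # mimics a permissible transform
x_unrelated = [rng.gauss(0, 1) for _ in range(d)]

intra_ber = bit_error_rate(binary_code(x, projection), binary_code(x_perturbed, projection))
inter_ber = bit_error_rate(binary_code(x, projection), binary_code(x_unrelated, projection))
```

With these stand-ins, `intra_ber` should be small and `inter_ber` should sit near 0.5, which corresponds to the "inter distances near $n/2$" criterion in the abstract.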
Aligning the Number of Parameters with the Number of Linear Regions for Improved Neural Network Approximation
Mathematics & AI · May 2026
The paper addresses the "black box" problem of neural networks by analyzing the approximation properties of latent layers. It proposes that a key limitation preventing the practical achievement of universal approximation theorems is the mismatch between the growth rates of a network's parameters and the number of linear regions partitioning the input space. It examines how this imbalance is exacerbated in multidimensional cases, hindering effective learning. To resolve this, methods are suggested to align parameter counts with the number of linear regions, such as moving activation vectors to the surface of a hypercube, utilizing micro-columns, and leveraging the "blessing of dimensionality" in deep networks to decouple complex signals.
Deep Learning for Educational Video Analysis: Benchmarking ASR Systems and Pipeline Optimization
Mathematics & AI · May 2026
We present a comparative analysis of eight managed commercial speech recognition providers (provider-side preprocessing, segmentation, and serving) for educational video transcription and enrichment, evaluated on over 700 lecture recordings (900+ hours) across disciplines. The Fireworks whisper-v3-turbo endpoint offers a favorable cost–quality–latency trade-off versus surveyed alternatives. Audio preprocessing reduces billed duration by 10–25% with negligible accuracy loss. Prompt-based “Video Vocabulary” reduces terminology errors without fine-tuning. We implement a parallel pipeline that cuts end-to-end turnaround from over 30 minutes of manual effort per recording to under two minutes, supports up to 50 concurrent jobs, and achieves roughly 22× speedup at about $0.075 per hour of content for transcription plus pedagogical enrichment (summaries, chapter topics, self-check questions) at list prices. The system is deployed in production.
OLoRA+: A Hybrid Approach to Parameter-Efficient Fine-Tuning of Large Language Models
Mathematics & AI · May 2026
Parameter-Efficient Fine-Tuning (PEFT) is essential for adapting Large Language Models (LLMs) under resource constraints, yet existing methods often treat initialization and optimization as separate concerns. This paper introduces OLoRA+, a novel hybrid approach that synergistically combines the structural stability of Orthonormal Low-Rank Adaptation (OLoRA) with the accelerated convergence of LoRA+. By initializing adapter matrices via QR decomposition of pre-trained weights and applying differential learning rates to the upstream and downstream projection matrices, OLoRA+ aims to enhance both stability and feature learning speed. We evaluated the method on LLMs using a subset of the Alpaca instruction-following dataset. Empirical results demonstrate that OLoRA+ consistently outperforms the standard OLoRA baseline across Evaluation Loss, BLEU, and ROUGE metrics without incurring additional computational costs. Crucially, our analysis uncovers two distinct effective learning regimes: a "Refinement" strategy (learning rate ratio λ < 1) that optimizes the initial orthonormal basis, and an "Exploration" strategy (λ > 1) that seeks new parameter directions. These findings suggest that OLoRA+ offers a more versatile and robust framework for efficient LLM adaptation than its predecessors.
Separate Adjustment of Linear and Nonlinear Parameters in Neural Network Training
Mathematics & AI · May 2026
The paper examines the limitations of the backpropagation error (BPE) method in neural network training, particularly its tendency to converge to suboptimal local minima. Traditional backpropagation-based training often suffers from inefficiencies in high-dimensional and complex optimization landscapes, which limits its effectiveness in deep learning applications. A modified neuron model is proposed, featuring adjustable parameters for nonlinear transformations such as ReLU and SoftPlus, which are adapted independently from connection weights. Unlike conventional models, which rely solely on weight optimization, our approach introduces independent parameter tuning for nonlinear transformations, allowing for more efficient exploration of the loss landscape. Based on vector-matrix analysis, the paper introduces an improved formal neuron model that reduces the likelihood of convergence to local minima far from the globally optimal solution. In the proposed model, the output activity is expressed as the sum of linear activation and its nonlinear transformation. This approach significantly enhances training speed and, in particular, approximation accuracy by introducing tunable parameters into the nonlinear function and optimizing them separately from the adjustment of input connection weights. The proposed model was evaluated on function approximation tasks of varying complexity in two- and three-dimensional spaces. The results demonstrate a 3–10 times reduction in training time and up to three orders of magnitude improvement in accuracy, especially for SoftPlus activation. These findings suggest that the proposed neuron model could be beneficial for deep learning applications requiring high precision and efficient training, such as medical imaging and autonomous systems. Additionally, the results emphasize the potential of vector-matrix analysis in improving neural network training methods, paving the way for further exploration of specialized optimization techniques.
Hybrid Bi-Level Index for Shortest Paths in Temporal Networks
Mathematics & AI · May 2026
Temporal graphs provide a natural model for dynamic relational data arising in modern AI systems, including event streams, temporal knowledge graphs, interaction networks, and transaction systems. Efficient reachability querying in such graphs constitutes a fundamental operation underlying temporal reasoning, feature extraction, and dynamic graph learning. In this paper, we propose a parameterized hybrid indexing framework for temporal reachability queries. Vertices are adaptively partitioned into two classes depending on the size of their reachable sets, enabling a controllable trade-off between memory usage and query time. Assuming a power-law degree distribution, we derive an analytical model for the proportion of promoted (large) vertices as a function of a promotion threshold. Closed-form asymptotic estimates for memory consumption and expected query time are obtained. We further prove the existence of a unique optimal threshold minimizing a combined memory–time cost functional. Theoretical predictions are validated experimentally, revealing a characteristic U-shaped dependence of query time on the promotion parameter. The results provide a mathematically grounded foundation for adaptive indexing in large-scale temporal graph analytics and AI-driven dynamic data systems.
Using Knowledge Graph in Adapting Language Model on Mathematical Texts
Mathematics & AI · May 2026
The subject of the study is the problem of adapting language models to scientific subject areas. The issues of extending language models to mathematical subject areas are considered. It is proposed to use the knowledge graph of the subject area as a tool for "tuning" the language model. To build the knowledge graph, the ontology of the subject area of the semantic library of mathematical subject areas and their applications, LibMeta, is used. Navigation through the subject area is carried out using the knowledge graph and is limited by the terminology of the thesaurus and ontology links. This approach allows using the knowledge graph to create a digital assistant in a recommender system, an agent for a language model, and to feed mathematical text data to a language model.
A Monotone System Generator for Solving Big Data Aggregation Problems
Mathematics & AI · May 2026
It is well known that the theory of monotone systems transforms clustering from a global optimization problem (which is often NP-hard) into a successive elimination problem solvable in polynomial time. The proposed approach requires minimal a priori information: specifying only the relationship measure of one object with a subset of objects. The algorithm guarantees an exact solution to the stated extremal problem. The approach is based on the concept of a monotone system $\langle A, F \rangle$, where $A$ is a finite set of objects, and $F(X)$ is an importance (or weight) function defined on subsets $X \subseteq A$. The monotonicity condition is: $F(X\setminus\{a\}) > F(X)$. We consider a generator of monotone systems. The proposed procedure for generating a family of monotone systems consists of two stages: i) constructing a set of transformation operators for a monotone system of a sufficiently general form, defined on the same initial set $W$, $|W| = N$; ii) constructing a set of basic functions on the set $W$. The desired generator is considered as a structure that generates compositional chains of operators over monotone systems selected as basis ones. The proposed extension of the class of basic functions for monotone systems is implemented in a class of Estimate Calculation Algorithms (ECA). A problem statement is formulated for defining a set of three types of basic functions in a monotone system in a class of estimator algorithms. Changing the sets of operators in the basic systems generates a family of monotone systems, which has a wide range of applications, for example, in genetic network analysis, natural language processing (NLP), image processing, and, in general, as a new tool for solving complex problems of structuring large data sets.
Dynamic Data Classification Based on a Semi-Supervised Local Poisson Label Propagation Method
Mathematics & AI · May 2026
We study semi-supervised classification in a dynamic data-stream setting, where objects and their relations evolve over time while only a small fraction of observations is labeled. Classical graph-based semi-supervised learning methods, such as label propagation and Laplacian-based regularization, typically reduce learning to the solution of a global graph problem. This requires storing the full graph and recomputing the solution whenever the graph structure changes, which becomes computationally expensive in streaming environments, especially when noisy, corrupted, or obsolete observations must be removed promptly from the model. Moreover, classical harmonic formulations degenerate in extremely low-label regimes. We propose Semi-Supervised Local Poisson Label Propagation (SSLPLP), a local Poisson-based framework for evolving graphs. The method formulates prediction updates through a graph Poisson equation with class-dependent sources and sinks induced by labeled vertices, aggregated through class supernodes. Instead of maintaining the full graph, SSLPLP keeps only a compact active neighborhood, where each newly arriving observation is connected to a limited set of active neighbors within a temporal window or k-NN structure. The key efficiency mechanism is local graph reduction via the star--mesh transformation. We prove that this reduction is exact: under the zero-sum solvability condition, elimination of zero-forcing unlabeled vertices preserves Poisson potentials on the active region. We further prove linear convergence of the iterative Poisson solver on the reduced graph, derive its spectral rate, and bound the numerical error accumulated over sequential reductions. Computational complexity is $O(\tau^{2}C)$ per streaming step, where $\tau$ is the active window size and $C$ is the number of classes, compared with $\Omega(n\tau^{2}C)$ for batch recomputation. We validate SSLPLP on two datasets: synthetic Two Moons and ECG arrhythmia classification (INCART-ECG).
On temporally ordered streams, SSLPLP achieves $88$--$96\%$ accuracy with as few as 2--5 labeled examples per class, consistently outperforming quantized label propagation and labels-only baselines. The framework is particularly suitable for sparse-label regimes with local temporal structure and naturally accommodates concept drift through exponentially decaying edge weights.
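The star-mesh transformation named in the abstract above can be illustrated with the textbook rule: eliminating a vertex whose incident edge weights are $w_i$ adds weight $w_i w_j / \sum_k w_k$ between every pair of its neighbours. The sketch below applies that rule to a small weighted graph; the dictionary representation and the function name are assumptions, self-loops are assumed absent, and the paper's class supernodes and source terms are not modelled.

```python
def star_mesh_eliminate(graph, node):
    """Return a new undirected weighted graph with `node` eliminated.

    `graph` maps each vertex to a dict {neighbour: weight} and is assumed
    symmetric and free of self-loops. Illustrative helper only.
    """
    neighbours = graph[node]
    total = sum(neighbours.values())

    # Copy the graph without the eliminated vertex and its incident edges.
    reduced = {
        u: {v: w for v, w in nbrs.items() if v != node}
        for u, nbrs in graph.items()
        if u != node
    }
    if total == 0:
        return reduced

    # Connect every pair of former neighbours with weight w_i * w_j / total.
    items = list(neighbours.items())
    for idx, (i, w_i) in enumerate(items):
        for j, w_j in items[idx + 1:]:
            extra = w_i * w_j / total
            reduced[i][j] = reduced[i].get(j, 0.0) + extra
            reduced[j][i] = reduced[j].get(i, 0.0) + extra
    return reduced


# Path a - m - b with both weights 2.0: removing m leaves one edge a - b.
path = {"a": {"m": 2.0}, "m": {"a": 2.0, "b": 2.0}, "b": {"m": 2.0}}
reduced = star_mesh_eliminate(path, "m")
```

In the path example the resulting weight is 1.0, matching the series combination of two conductances of 2.0.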
A Systematic Study of Gate Functions in Soft Adaptive Policy Optimization
Mathematics & AI · May 2026
Group Relative Policy Optimization (GRPO) has significantly advanced the training of large language models and enhanced their reasoning capabilities, but it remains susceptible to instability due to its use of hard clipping. Soft Adaptive Policy Optimization (SAPO) addresses this limitation by replacing clipping with a smooth sigmoid-based gate function, which leads to more stable updates. We push this idea further and investigate the impact of different gate functions on both training stability and final model performance. We formalize the key properties that admissible gates should satisfy and propose several families of such functions for empirical evaluation. This paper presents an analysis of our findings based on experiments conducted with the Qwen2.5-7B-Instruct model on mathematical reasoning tasks. These results provide practical guidance for designing smoother and more robust policy optimization objectives for large language model training.