Abstract: The inference cost of Large Language Models (LLMs) is a significant challenge due to their computational demands, especially on tasks requiring long outputs. Natural language, however, often contains redundancy, which presents an opportunity for optimization. We observe that, when prompted appropriately, LLMs can generate distilled, concise outputs that retain the essential meaning. We propose a framework for reducing computational cost in which a shorter, distilled output from the LLM is reconstructed into a full narrative by a smaller model with lower inference costs. Our experiments show promising results, particularly in general-knowledge domains, saving 20.58% of tokens on average with only a slight decrease in evaluation metrics, suggesting that this approach can effectively balance efficiency and accuracy in language processing tasks.
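A minimal sketch of the two-stage distill-then-reconstruct pipeline the abstract describes, assuming a generic `generate(model, prompt)` inference helper; the model identifiers, prompts, and helper are hypothetical placeholders, not the paper's actual implementation:

```python
def generate(model: str, prompt: str) -> str:
    """Placeholder for a call to an LLM inference backend (assumption)."""
    raise NotImplementedError

def answer_with_distillation(question: str) -> str:
    # Stage 1: the large (expensive) model is prompted to emit a compressed
    # answer, dropping redundant language while keeping essential meaning.
    distilled = generate(
        model="large-llm",  # hypothetical model id
        prompt=f"Answer concisely, omitting filler words:\n{question}",
    )
    # Stage 2: a smaller (cheaper) model expands the compressed answer
    # back into a full, fluent narrative.
    return generate(
        model="small-lm",  # hypothetical model id
        prompt=f"Rewrite the following notes as a complete narrative:\n{distilled}",
    )
```

The cost saving comes from shifting the bulk of output tokens from the large model in stage 1 to the cheaper model in stage 2.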
Abstract: Once a decision to rebalance or update a portfolio of funds has been made, changing the current portfolio into the target one involves a set of transactions that are amenable to optimization. This is particularly relevant when managers have to handle the implications of different types of instruments. In this work we present linear programming and heuristic search approaches that produce plans for executing the update. The evaluation of our proposals shows cost improvements over the baseline strategy used for comparison. The models can easily be extended to other realistic scenarios in which holistic portfolio management is required.
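A minimal sketch of one way the portfolio transition could be posed as a linear program, solved here with `scipy.optimize.linprog`; the fund values, per-fund cost rates, and the buy/sell variable decomposition are illustrative assumptions, not the paper's exact model:

```python
import numpy as np
from scipy.optimize import linprog

current = np.array([40.0, 35.0, 25.0])  # current value held per fund
target = np.array([30.0, 30.0, 40.0])   # desired value per fund
cost = np.array([0.010, 0.015, 0.020])  # proportional cost per unit traded

n = len(current)
# Decision variables: x = [buys (n), sells (n)], all non-negative.
c = np.concatenate([cost, cost])         # pay costs on both buys and sells
# Equality constraint per fund: current + buy - sell == target.
A_eq = np.hstack([np.eye(n), -np.eye(n)])
b_eq = target - current

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (2 * n))
buys, sells = res.x[:n], res.x[n:]
print("buy: ", np.round(buys, 2))
print("sell:", np.round(sells, 2))
print("total transaction cost:", round(res.fun, 4))
```

In this toy form the optimum is simply to buy or sell each fund's shortfall or excess; the LP framing becomes useful once realistic constraints (instrument-specific rules, settlement of sale proceeds before purchases, lot sizes) are layered onto the same variables.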