
La Biblia de la IA - The Bible of AI™ Journal

Our objective and mission: The era of Artificial Intelligence (AI) is one of humanity's greatest challenges for the future. Its proper management and control, built on firm and legislated foundations, will shape a safe and technological tomorrow. We aim to convey the views of researchers, universities, the media, citizens, responsible bodies and, in particular, decision-making institutions such as the European Union: the ground from which a technology-linked tomorrow will flourish. We intend to filter, critique and recommend publications of broad interest; to receive and publish external works so they reach a wider audience; to publish our own works (or works submitted to us under our public and general rules, at no cost to the authors); and to build bridges between society and those responsible. https://editorialia.com/



Flips

  • Sequence Feature Extraction for Malware Family Analysis via Graph Neural Network

    https://editorialia.com/2022/09/25/publications-r0identifier_035d2ee6677504e68a7eb8820884a335-sequence-feature-extraction-for-malware-family-analysis-via-graph-neural-network/
  • Forecasting: Principles and Practice (3rd ed)

    The book is written for three audiences: (1) people who find themselves doing forecasting in business without formal training in the area; (2) undergraduate students studying business; (3) MBA students taking a forecasting elective.

    Welcome to our online textbook on forecasting. This textbook is intended to provide a comprehensive introduction to forecasting methods and to present …

  • LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

    Large language models have been widely adopted but require significant GPU memory for inference. We develop a procedure for Int8 matrix multiplication for the feed-forward and attention projection layers in transformers, which cuts the memory needed for inference in half while retaining full-precision performance. With our method, a 175B-parameter 16/32-bit checkpoint can be loaded, converted to Int8, and used immediately without performance degradation. This is made possible by understanding and working around properties of highly systematic emergent features in transformer language models that dominate attention and transformer predictive performance. To cope with these features, we develop a two-part quantization procedure, LLM.int8(). We first use vector-wise quantization with separate normalization constants for each inner product in the matrix multiplication to quantize most of the features. For the emergent outliers, however, we also include a new mixed-precision decomposition scheme, which isolates the outlier feature dimensions into a 16-bit matrix multiplication while more than 99.9% of values are still multiplied in 8-bit. Using LLM.int8(), we show empirically that it is possible to perform inference in LLMs with up to 175B parameters without any performance degradation. This result makes such models much more accessible, for example making it possible to use OPT-175B/BLOOM on a single server with consumer GPUs.

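    To make the two-part procedure concrete, here is a minimal NumPy sketch of the idea. It is an illustration under assumptions, not the paper's implementation (the production CUDA kernels live in the authors' bitsandbytes library); the outlier threshold of 6.0 matches the magnitude cutoff reported in the paper, while the function name and toy shapes are invented for the example.

    ```python
    import numpy as np

    def llm_int8_matmul(X, W, outlier_threshold=6.0):
        """Sketch: vector-wise int8 matmul plus mixed-precision
        decomposition for outlier feature dimensions."""
        # Outlier feature dimensions: columns of X with any entry
        # exceeding the threshold stay in full precision.
        outliers = np.any(np.abs(X) > outlier_threshold, axis=0)
        out_hi = X[:, outliers] @ W[outliers, :]

        # Regular dimensions: vector-wise quantization, i.e. one scaling
        # constant per row of X and one per column of W, so every inner
        # product gets its own pair of normalization constants.
        Xr, Wr = X[:, ~outliers], W[~outliers, :]
        sx = np.abs(Xr).max(axis=1, keepdims=True) / 127.0
        sw = np.abs(Wr).max(axis=0, keepdims=True) / 127.0
        Xq = np.round(Xr / sx).astype(np.int8)
        Wq = np.round(Wr / sw).astype(np.int8)

        # Int8 multiply with int32 accumulation, then dequantize.
        out_lo = (Xq.astype(np.int32) @ Wq.astype(np.int32)) * (sx * sw)
        return out_hi + out_lo

    # Toy check: plant one "emergent outlier" dimension and compare
    # against the full-precision product; the error stays small.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 64)); X[:, 3] *= 20.0
    W = rng.normal(size=(64, 8))
    print(np.abs(llm_int8_matmul(X, W) - X @ W).max())
    ```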

  • What Do We Maximize in Self-Supervised Learning?
    «In this paper, we examine self-supervised learning methods, particularly VICReg, to provide an information-theoretical understanding of their construction. As a first step, we demonstrate how information-theoretic quantities can be obtained for a deterministic network, offering a possible alternative to prior work that relies on stochastic models. This enables us to demonstrate how VICReg can be (re)discovered from first principles and its assumptions about data distribution. Furthermore, we empirically demonstrate the validity of our assumptions, confirming our novel understanding of VICReg. Finally, we believe that the derivation and insights we obtain can be generalized to many other SSL methods, opening new avenues for theoretical and practical understanding of SSL and transfer learning.»

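    For readers unfamiliar with the method being analyzed: VICReg (Bardes et al.) trains a joint-embedding network with three terms (variance, invariance and covariance) applied to the embeddings of two augmented views. A minimal PyTorch-style sketch follows; the coefficients are the commonly cited VICReg defaults, and the function is a simplification of the original loss, not this paper's information-theoretic derivation.

    ```python
    import torch

    def vicreg_loss(z_a, z_b, lam=25.0, mu=25.0, nu=1.0, eps=1e-4):
        """Sketch of the VICReg objective for embeddings z_a, z_b of
        shape (N, D) computed from two augmented views of a batch."""
        n, d = z_a.shape

        # Invariance: two views of the same input should embed closely.
        inv = torch.nn.functional.mse_loss(z_a, z_b)

        # Variance: hinge loss keeping the std of each embedding
        # dimension above 1, which prevents collapse to a constant.
        def var_term(z):
            std = torch.sqrt(z.var(dim=0) + eps)
            return torch.relu(1.0 - std).mean()

        # Covariance: decorrelate embedding dimensions by driving
        # off-diagonal covariance entries toward zero.
        def cov_term(z):
            z = z - z.mean(dim=0)
            cov = (z.T @ z) / (n - 1)
            off_diag = cov - torch.diag(torch.diag(cov))
            return off_diag.pow(2).sum() / d

        return (lam * inv
                + mu * (var_term(z_a) + var_term(z_b))
                + nu * (cov_term(z_a) + cov_term(z_b)))
    ```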

  • Atlas: Few-shot Learning with Retrieval Augmented Language Models

    Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, massive parameter counts to store knowledge seem to be needed. Retrieval-augmented models are known to excel at knowledge-intensive tasks without the need for as many parameters, but it is unclear whether they work in few-shot settings. In this work we present Atlas, a carefully designed and pre-trained retrieval-augmented language model able to learn knowledge-intensive tasks with very few training examples. We perform evaluations on a wide range of tasks, including MMLU, KILT and NaturalQuestions, and study the impact of the content of the document index, showing that it can easily be updated. Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming a 540B-parameter model by 3% despite having 50x fewer parameters.
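    The core mechanism, stripped of Atlas's joint training of retriever and reader, is simple enough to sketch. Below is a generic retrieval-augmented inference loop; `embed` and `generate` are hypothetical stand-ins for a dense retriever and a language model (Atlas itself pairs a Contriever retriever with a T5 reader), not Atlas's actual API.

    ```python
    import numpy as np

    def retrieve(query_vec, doc_vecs, docs, k=5):
        """Dense retrieval: the k documents whose embeddings score
        highest by dot product against the query embedding."""
        top = np.argsort(-(doc_vecs @ query_vec))[:k]
        return [docs[i] for i in top]

    def answer(question, embed, generate, doc_vecs, docs, k=5):
        """Retrieval-augmented generation: condition the model on
        retrieved passages instead of on its parameters alone."""
        passages = retrieve(embed(question), doc_vecs, docs, k)
        prompt = "\n\n".join(passages) + f"\n\nQuestion: {question}\nAnswer:"
        return generate(prompt)
    ```

    In this framing, updating the model's knowledge amounts to swapping out `docs` and `doc_vecs`, which is the easily updated document index the abstract highlights.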