What's Left Before AGI? PaLM-E, 'GPT 4' and Multi-Modality

The video discusses the current state of AI, including milestones in predicting and reading images, but notes that robust machine reading and common sense reasoning are still far away, and highlights concerns about the need for independent review and limits on growth as AI continues to advance.

  • 🚀
    00:00
    PaLM-E and Microsoft's new Visual ChatGPT (text, image, video) are major milestones in AI that can predict, read faces, and answer natural language questions about images, bringing us closer to AGI.
    • Milestones in multi-modality and logical reasoning have fallen, but it is still unclear what tasks remain before a model can be called artificial general intelligence.
    • PaLM-E and Microsoft's new Visual ChatGPT (text, image, video) are major milestones that can predict what will happen next from an image, read faces, and answer natural language questions about them.
    • PaLM-E is an advancement on Gato. The definition of AGI used here is a model that is at or above the human level on a majority of economic tasks currently done by humans.
  • 🤖
    02:59
    Robust machine reading is still far away as it requires logic and common sense reasoning, but Bing can answer questions accurately.
    • Robust machine reading is a distant prospect as it requires logic and common sense reasoning to answer questions about a children's book.
    • Bing can answer questions accurately, but the question of what tasks are left before AGI remains, as shown by a graph from the original PaLM model.
  • 📈
    04:41
    Humans outperform PaLM in recognizing ASCII numerals, while models like Bing and GPT struggle with tasks such as time tracking, text editing, and adjective order.
    • Humans outperformed PaLM in various tasks, including recognizing ASCII numerals, according to the appendix of the lecture notes.
    • Models like Bing and GPT struggle with tasks such as keeping track of time in a series of events, text editing, and intuitive adjective order.
  • 💻
    07:15
    The language model struggles with honesty, and the challenge is to understand what is going on inside the machine.
  • 📝
    08:49
    Large language models can be computationally universal with access to unbounded external memory, as shown in a January paper, while a paper on Meta's new LLaMA model shows improvement in model performance plateauing.
    • A paper from January describes augmenting PaLM with read-write memory so that it can remember everything and process arbitrarily long inputs, including simulating a universal Turing machine. This shows that large language models are already computationally universal if they have access to an unbounded external memory.
    • Improvement in model performance levels off after a certain point, as shown in a paper on Meta's new LLaMA model.
  • 🤖
    10:56
    AI language models can accurately answer vaguely phrased natural language questions.
    • The language model struggles with social interaction question answering and natural questions, even with trillions of tokens.
    • AI language models can answer natural language questions accurately, even if they are vaguely phrased.
  • 🤖
    12:44
    A 1000x increase in compute over the next 5 years could lead to human-level AI, but there are concerns about the need for independent review and limits on growth.
    • Data helps, but compute is a rough proxy for further progress, and a 1000-fold increase in compute over the next five years could result in human-level performance across most tasks.
    • AGI may require independent review and a limit on the rate of growth of compute used for creating new models, but it's uncertain if companies like Microsoft, Tesla, or Amazon would agree with this.
    • AGI may happen in less than five years, with some arguing it's already here, and the exponential growth of knowledge in AI makes it difficult to keep up with the latest advancements.
  • 🧠
    15:56
    The debate over AGI's capabilities is subjective. Text-to-image generation is a new frontier led by Microsoft and Google, and rewarding models based on good process is crucial.
    • What tasks are left before AGI is a deeper and more subjective debate, in which only obscure feats of logic, deeply subjective analyses of difficult texts, and niche areas of mathematics and science remain out of reach.
    • Text-to-image generation is the new story of the century, with companies like Microsoft and Google leading the way, but it's important to consider rewarding models based on good process rather than just quick outcomes.
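
As a rough sanity check on the 12:44 claim, a 1000x increase in compute over five years can be converted into an annual growth factor and a doubling time. The sketch below assumes smooth, constant exponential growth, which the summary does not state; the function name `implied_growth` is illustrative only.

```python
import math


def implied_growth(total_factor: float, years: float) -> tuple[float, float]:
    """Return (annual growth factor, doubling time in months).

    Assumes constant exponential growth over the whole period, which is
    a simplifying assumption and not a claim made in the video summary.
    """
    # Annual factor g satisfies g ** years == total_factor.
    annual = total_factor ** (1.0 / years)
    # Number of doublings needed is log2(total_factor); spread over the period.
    doubling_months = 12.0 * years / math.log2(total_factor)
    return annual, doubling_months


annual, doubling = implied_growth(1000.0, 5.0)
print(f"annual factor: {annual:.2f}x, doubling every {doubling:.1f} months")
```

Under that assumption, 1000x in five years works out to roughly a 4x increase per year, or a doubling about every six months.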