OpenAI-papers

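OpenAI's research is seriously impressive, so I have collected the papers from its Research page here, along with links to code where available. Study hard and make progress every day.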

2024

  • Measuring short-form factuality in large language models [pdf] [code]
  • Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models [pdf]
  • First-Person Fairness in Chatbots [pdf]
  • MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering [pdf] [code]
  • Rule Based Rewards for Language Model Safety [pdf] [code]
  • Prover-Verifier Games improve legibility of LLM outputs [pdf]
  • A Holistic Approach to Undesired Content Detection in the Real World [pdf]
  • Improved Techniques for Training Consistency Models [pdf]
  • Consistency Models [pdf]
  • Scaling and evaluating sparse autoencoders [pdf] [code]
  • The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions [pdf]

2023

  • Let's Verify Step by Step [pdf]
  • GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models [pdf]
  • GPT-4 Technical Report [pdf]

2022

  • Point-E: A System for Generating 3D Point Clouds from Complex Prompts [pdf] [code]
  • Scaling Laws for Reward Model Overoptimization [pdf]
  • Robust Speech Recognition via Large-Scale Weak Supervision [pdf] [code]
  • Efficient Training of Language Models to Fill in the Middle [pdf] [code]
  • Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [pdf] [code]
  • Evolution through Large Models [pdf]
  • Teaching Models to Express Their Uncertainty in Words [pdf]
  • Hierarchical Text-Conditional Image Generation with CLIP Latents [pdf]
  • A Research Agenda for Assessing the Economic Impacts of Code Generation Models [pdf]
  • Formal Mathematics Statement Curriculum Learning [pdf]
  • Text and Code Embeddings by Contrastive Pre-Training [pdf]

2021

  • WebGPT: Browser-assisted question-answering with human feedback [pdf]
  • Training Verifiers to Solve Math Word Problems [pdf]
  • TruthfulQA: Measuring How Models Mimic Human Falsehoods [pdf] [code]
  • Evaluating Large Language Models Trained on Code [pdf]
  • Multimodal Neurons in Artificial Neural Networks [pdf] [code]
  • Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models [pdf]
  • Learning Transferable Visual Models From Natural Language Supervision [pdf] [code]

2020

  • Generative Language Modeling for Automated Theorem Proving [pdf]
  • Generative Pretraining from Pixels [pdf] [code]
  • Language Models are Few-Shot Learners [pdf]
  • Measuring the Algorithmic Efficiency of Neural Networks [pdf]
  • Jukebox: A Generative Model for Music [pdf] [code]
  • Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims [pdf]
  • Scaling Laws for Neural Language Models [pdf]

2019

  • Dota 2 with Large Scale Deep Reinforcement Learning [pdf]
  • Deep Double Descent: Where Bigger Models and More Data Hurt [pdf]
  • Leveraging Procedural Generation to Benchmark Reinforcement Learning [pdf] [code]
  • Solving Rubik's Cube with a Robot Hand [pdf]
  • Emergent Tool Use From Multi-Agent Autocurricula [pdf] [code]
  • Release Strategies and the Social Impacts of Language Models [pdf] [code]
  • Generating Long Sequences with Sparse Transformers [pdf] [code]
  • Implicit Generation and Generalization in Energy-Based Models [pdf] [code]
  • Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents [pdf] [code]
  • Language Models are Unsupervised Multitask Learners [pdf] [code]
  • Computational Limitations in Robust Classification and Win-Win Results [pdf]

2018

  • An Empirical Model of Large-Batch Training [pdf]
  • Quantifying Generalization in Reinforcement Learning [pdf] [code]
  • Concept Learning with Energy-Based Models [pdf]
  • Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control [pdf]
  • Exploration by Random Network Distillation [pdf] [code]
  • FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models [pdf]
  • Large-Scale Study of Curiosity-Driven Learning [pdf] [code]
  • Learning Dexterous In-Hand Manipulation [pdf]
  • Variational Option Discovery Algorithms [pdf]
  • Glow: Generative Flow with Invertible 1x1 Convolutions [pdf] [code]
  • Learning Policy Representations in Multiagent Systems [pdf]
  • GamePad: A Learning Environment for Theorem Proving [pdf] [code](https://github.com/ml4tp/gamepad)
  • Evolved Policy Gradients [pdf] [code]
  • Gotta Learn Fast: A New Benchmark for Generalization in RL [pdf]
  • Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines [pdf]
  • Improving GANs Using Optimal Transport [pdf]
  • On First-Order Meta-Learning Algorithms [pdf] [code]
  • Some Considerations on Learning to Explore via Meta-Reinforcement Learning [pdf]
  • Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research [pdf]
  • DeepType: Multilingual Entity Linking by Neural Type System Evolution [pdf] [code]

2017

  • GPU Kernels for Block-Sparse Weights [pdf] [code]
  • Learning Sparse Neural Networks through \(L_0\) Regularization [pdf] [code]
  • Interpretable and Pedagogical Examples [pdf]
  • Meta Learning Shared Hierarchies [pdf] [code]
  • Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [pdf]
  • Asymmetric Actor Critic for Image-Based Robot Learning [pdf]
  • Domain Randomization and Generative Models for Robotic Grasping [pdf]
  • Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments [pdf] [code]
  • Emergent Complexity via Multi-Agent Competition [pdf] [code]
  • Learning with Opponent-Learning Awareness [pdf] [code]
  • Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [pdf] [code]
  • Parameter Space Noise for Exploration [pdf] [code]
  • Proximal Policy Optimization Algorithms [pdf] [code]
  • Synthesizing Robust Adversarial Examples [pdf]
  • Hindsight Experience Replay [pdf]
  • Teacher-Student Curriculum Learning [pdf] [code]
  • Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [pdf] [code]
  • UCB Exploration via Q-Ensembles [pdf]
  • Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World [pdf]
  • One-Shot Imitation Learning [pdf]
  • Equivalence Between Policy Gradients and Soft Q-Learning [pdf]
  • Stochastic Neural Networks for Hierarchical Reinforcement Learning [pdf] [code]
  • Learning to Generate Reviews and Discovering Sentiment [pdf] [code]
  • Evolution Strategies as a Scalable Alternative to Reinforcement Learning [pdf]
  • Emergence of Grounded Compositional Language in Multi-Agent Populations [pdf]
  • Prediction and Control with Temporal Segment Models [pdf]
  • Third-Person Imitation Learning [pdf] [code]
  • PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications [pdf] [code]

2016

  • #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning [pdf]
  • On the Quantitative Analysis of Decoder-Based Generative Models [pdf] [code]
  • A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [pdf]
  • \(\text{RL}^2\): Fast Reinforcement Learning via Slow Reinforcement Learning [pdf]
  • Variational Lossy Autoencoder [pdf]
  • Extensions and Limitations of the Neural GPU [pdf] [code]
  • Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model [pdf]
  • OpenAI Gym [pdf]
  • Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks [pdf]

OpenAI-papers
https://blog.lfd.world/2024/11/26/openai-papers/
Author: 培根请加蛋
Published: November 26, 2024