OpenAI-papers
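OpenAI does truly impressive work, so I have collected the papers from its Research page here, along with the code where it is available. Study hard and improve every day.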
2024
- Measuring short-form factuality in large language models [pdf] [code]
- Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models [pdf]
- First-Person Fairness in Chatbots [pdf]
- MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering [pdf] [code]
- Rule Based Rewards for Language Model Safety [pdf] [code]
- Prover-Verifier Games improve legibility of LLM outputs [pdf]
- A Holistic Approach to Undesired Content Detection in the Real World [pdf]
- Improved Techniques for Training Consistency Models [pdf]
- Consistency Models [pdf]
- Scaling and evaluating sparse autoencoders [pdf] [code]
- The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions [pdf]
2023
- Let's Verify Step by Step [pdf]
- GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models [pdf]
- GPT-4 Technical Report [pdf]
2022
- Point-E: A System for Generating 3D Point Clouds from Complex Prompts [pdf] [code]
- Scaling Laws for Reward Model Overoptimization [pdf]
- Robust Speech Recognition via Large-Scale Weak Supervision [pdf] [code]
- Efficient Training of Language Models to Fill in the Middle [pdf] [code]
- Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [pdf] [code]
- Evolution through Large Models [pdf]
- Teaching Models to Express Their Uncertainty in Words [pdf]
- Hierarchical Text-Conditional Image Generation with CLIP Latents [pdf]
- A Research Agenda for Assessing the Economic Impacts of Code Generation Models [pdf]
- Formal Mathematics Statement Curriculum Learning [pdf]
- Text and Code Embeddings by Contrastive Pre-Training [pdf]
2021
- WebGPT: Browser-assisted question-answering with human feedback [pdf]
- Training Verifiers to Solve Math Word Problems [pdf]
- TruthfulQA: Measuring How Models Mimic Human Falsehoods [pdf] [code]
- Evaluating Large Language Models Trained on Code [pdf]
- Multimodal Neurons in Artificial Neural Networks [pdf] [code]
- Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models [pdf]
- Learning Transferable Visual Models From Natural Language Supervision [pdf] [code]
2020
- Generative Language Modeling for Automated Theorem Proving [pdf]
- Generative Pretraining from Pixels [pdf] [code]
- Language Models are Few-Shot Learners [pdf]
- Measuring the Algorithmic Efficiency of Neural Networks [pdf]
- Jukebox: A Generative Model for Music [pdf] [code]
- Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims [pdf]
- Scaling Laws for Neural Language Models [pdf]
2019
- Dota 2 with Large Scale Deep Reinforcement Learning [pdf]
- Deep Double Descent: Where Bigger Models and More Data Hurt [pdf]
- Leveraging Procedural Generation to Benchmark Reinforcement Learning [pdf] [code]
- Solving Rubik's Cube with a Robot Hand [pdf]
- Emergent Tool Use From Multi-Agent Autocurricula [pdf] [code]
- Release Strategies and the Social Impacts of Language Models [pdf] [code]
- Generating Long Sequences with Sparse Transformers [pdf] [code]
- Implicit Generation and Generalization in Energy-Based Models [pdf] [code]
- Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents [pdf] [code]
- Language Models are Unsupervised Multitask Learners [pdf] [code]
- Computational Limitations in Robust Classification and Win-Win Results [pdf]
2018
- An Empirical Model of Large-Batch Training [pdf]
- Quantifying Generalization in Reinforcement Learning [pdf] [code]
- Concept Learning with Energy-Based Models [pdf]
- Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control [pdf]
- Exploration by Random Network Distillation [pdf] [code]
- FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models [pdf]
- Large-Scale Study of Curiosity-Driven Learning [pdf] [code]
- Learning Dexterous In-Hand Manipulation [pdf]
- Variational Option Discovery Algorithms [pdf]
- Glow: Generative Flow with Invertible 1x1 Convolutions [pdf] [code]
- Learning Policy Representations in Multiagent Systems [pdf]
- GamePad: A Learning Environment for Theorem Proving [pdf] [code](https://github.com/ml4tp/gamepad)
- Evolved Policy Gradients [pdf] [code]
- Gotta Learn Fast: A New Benchmark for Generalization in RL [pdf]
- Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines [pdf]
- Improving GANs Using Optimal Transport [pdf]
- On First-Order Meta-Learning Algorithms [pdf] [code]
- Some Considerations on Learning to Explore via Meta-Reinforcement Learning [pdf]
- Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research [pdf]
- DeepType: Multilingual Entity Linking by Neural Type System Evolution [pdf] [code]
2017
- GPU Kernels for Block-Sparse Weights [pdf] [code]
- Learning Sparse Neural Networks through \(L_0\) Regularization [pdf] [code]
- Interpretable and Pedagogical Examples [pdf]
- Meta Learning Shared Hierarchies [pdf] [code]
- Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [pdf]
- Asymmetric Actor Critic for Image-Based Robot Learning [pdf]
- Domain Randomization and Generative Models for Robotic Grasping [pdf]
- Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments [pdf] [code]
- Emergent Complexity via Multi-Agent Competition [pdf] [code]
- Learning with Opponent-Learning Awareness [pdf] [code]
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [pdf] [code]
- Parameter Space Noise for Exploration [pdf] [code]
- Proximal Policy Optimization Algorithms [pdf] [code]
- Synthesizing Robust Adversarial Examples [pdf]
- Hindsight Experience Replay [pdf]
- Teacher-Student Curriculum Learning [pdf] [code]
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [pdf] [code]
- UCB Exploration via Q-Ensembles [pdf]
- Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World [pdf]
- One-Shot Imitation Learning [pdf]
- Equivalence Between Policy Gradients and Soft Q-Learning [pdf]
- Stochastic Neural Networks for Hierarchical Reinforcement Learning [pdf] [code]
- Learning to Generate Reviews and Discovering Sentiment [pdf] [code]
- Evolution Strategies as a Scalable Alternative to Reinforcement Learning [pdf]
- Emergence of Grounded Compositional Language in Multi-Agent Populations [pdf]
- Prediction and Control with Temporal Segment Models [pdf]
- Third-Person Imitation Learning [pdf] [code]
- PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications [pdf] [code]
2016
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning [pdf]
- On the Quantitative Analysis of Decoder-Based Generative Models [pdf] [code]
- A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [pdf]
- \(\text{RL}^2\): Fast Reinforcement Learning via Slow Reinforcement Learning [pdf]
- Variational Lossy Autoencoder [pdf]
- Extensions and Limitations of the Neural GPU [pdf] [code]
- Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model [pdf]
- OpenAI Gym [pdf]
- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks [pdf]
https://blog.lfd.world/2024/11/26/openai-papers/