OpenAI-papers
OpenAI is seriously impressive, so I have compiled its Research publications here, with code included where available. Study hard and improve every day.
2024
- Measuring short-form factuality in large language models [pdf] [code]
 - Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models [pdf]
 - First-Person Fairness in Chatbots [pdf]
 - MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering [pdf] [code]
 - Rule Based Rewards for Language Model Safety [pdf] [code]
 - Prover-Verifier Games improve legibility of LLM outputs [pdf]
 - A Holistic Approach to Undesired Content Detection in the Real World [pdf]
 - Improved Techniques for Training Consistency Models [pdf]
 - Consistency Models [pdf]
 - Scaling and evaluating sparse autoencoders [pdf] [code]
 - The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions [pdf]
 
2023
- Let's Verify Step by Step [pdf]
 - GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models [pdf]
 - GPT-4 Technical Report [pdf]
 
2022
- Point-E: A System for Generating 3D Point Clouds from Complex Prompts [pdf] [code]
 - Scaling Laws for Reward Model Overoptimization [pdf]
 - Robust Speech Recognition via Large-Scale Weak Supervision [pdf] [code]
 - Efficient Training of Language Models to Fill in the Middle [pdf] [code]
 - Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [pdf] [code]
 - Evolution through Large Models [pdf]
 - Teaching Models to Express Their Uncertainty in Words [pdf]
 - Hierarchical Text-Conditional Image Generation with CLIP Latents [pdf]
 - A Research Agenda for Assessing the Economic Impacts of Code Generation Models [pdf]
 - Formal Mathematics Statement Curriculum Learning [pdf]
 - Text and Code Embeddings by Contrastive Pre-Training [pdf]
 
2021
- WebGPT: Browser-assisted question-answering with human feedback [pdf]
 - Training Verifiers to Solve Math Word Problems [pdf]
 - TruthfulQA: Measuring How Models Mimic Human Falsehoods [pdf] [code]
 - Evaluating Large Language Models Trained on Code [pdf]
 - Multimodal Neurons in Artificial Neural Networks [pdf] [code]
 - Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models [pdf]
 - Learning Transferable Visual Models From Natural Language Supervision [pdf] [code]
 
2020
- Generative Language Modeling for Automated Theorem Proving [pdf]
 - Generative Pretraining from Pixels [pdf] [code]
 - Language Models are Few-Shot Learners [pdf]
 - Measuring the Algorithmic Efficiency of Neural Networks [pdf]
 - Jukebox: A Generative Model for Music [pdf] [code]
 - Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims [pdf]
 - Scaling Laws for Neural Language Models [pdf]
 
2019
- Dota 2 with Large Scale Deep Reinforcement Learning [pdf]
 - Deep Double Descent: Where Bigger Models and More Data Hurt [pdf]
 - Leveraging Procedural Generation to Benchmark Reinforcement Learning [pdf] [code]
 - Solving Rubik's Cube with a Robot Hand [pdf]
 - Emergent Tool Use From Multi-Agent Autocurricula [pdf] [code]
 - Release Strategies and the Social Impacts of Language Models [pdf] [code]
 - Generating Long Sequences with Sparse Transformers [pdf] [code]
 - Implicit Generation and Generalization in Energy-Based Models [pdf] [code]
 - Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents [pdf] [code]
 - Language Models are Unsupervised Multitask Learners [pdf] [code]
 - Computational Limitations in Robust Classification and Win-Win Results [pdf]
 
2018
- An Empirical Model of Large-Batch Training [pdf]
 - Quantifying Generalization in Reinforcement Learning [pdf] [code]
 - Concept Learning with Energy-Based Models [pdf]
 - Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control [pdf]
 - Exploration by Random Network Distillation [pdf] [code]
 - FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models [pdf]
 - Large-Scale Study of Curiosity-Driven Learning [pdf] [code]
 - Learning Dexterous In-Hand Manipulation [pdf]
 - Variational Option Discovery Algorithms [pdf]
 - Glow: Generative Flow with Invertible 1x1 Convolutions [pdf] [code]
 - Learning Policy Representations in Multiagent Systems [pdf]
 - GamePad: A Learning Environment for Theorem Proving [pdf] [code](https://github.com/ml4tp/gamepad)
 - Evolved Policy Gradients [pdf] [code]
 - Gotta Learn Fast: A New Benchmark for Generalization in RL [pdf]
 - Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines [pdf]
 - Improving GANs Using Optimal Transport [pdf]
 - On First-Order Meta-Learning Algorithms [pdf] [code]
 - Some Considerations on Learning to Explore via Meta-Reinforcement Learning [pdf]
 - Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research [pdf]
 - DeepType: Multilingual Entity Linking by Neural Type System Evolution [pdf] [code]
 
2017
- GPU Kernels for Block-Sparse Weights [pdf] [code]
 - Learning Sparse Neural Networks through \(L_0\) Regularization [pdf] [code]
 - Interpretable and Pedagogical Examples [pdf]
 - Meta Learning Shared Hierarchies [pdf] [code]
 - Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [pdf]
 - Asymmetric Actor Critic for Image-Based Robot Learning [pdf]
 - Domain Randomization and Generative Models for Robotic Grasping [pdf]
 - Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments [pdf] [code]
 - Emergent Complexity via Multi-Agent Competition [pdf] [code]
 - Learning with Opponent-Learning Awareness [pdf] [code]
 - Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [pdf] [code]
 - Parameter Space Noise for Exploration [pdf] [code]
 - Proximal Policy Optimization Algorithms [pdf] [code]
 - Synthesizing Robust Adversarial Examples [pdf]
 - Hindsight Experience Replay [pdf]
 - Teacher-Student Curriculum Learning [pdf] [code]
 - Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [pdf] [code]
 - UCB Exploration via Q-Ensembles [pdf]
 - Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World [pdf]
 - One-Shot Imitation Learning [pdf]
 - Equivalence Between Policy Gradients and Soft Q-Learning [pdf]
 - Stochastic Neural Networks for Hierarchical Reinforcement Learning [pdf] [code]
 - Learning to Generate Reviews and Discovering Sentiment [pdf] [code]
 - Evolution Strategies as a Scalable Alternative to Reinforcement Learning [pdf]
 - Emergence of Grounded Compositional Language in Multi-Agent Populations [pdf]
 - Prediction and Control with Temporal Segment Models [pdf]
 - Third-Person Imitation Learning [pdf] [code]
 - PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications [pdf] [code]
 
2016
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning [pdf]
 - On the Quantitative Analysis of Decoder-Based Generative Models [pdf] [code]
 - A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [pdf]
 - \(\text{RL}^2\): Fast Reinforcement Learning via Slow Reinforcement Learning [pdf]
 - Variational Lossy Autoencoder [pdf]
 - Extensions and Limitations of the Neural GPU [pdf] [code]
 - Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model [pdf]
 - OpenAI Gym [pdf]
 - Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks [pdf]
 
Original post: https://blog.lfd.world/2024/11/26/openai-papers/