SQuAD

The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to each question is a segment of text, or span, from the corresponding reading passage; in SQuAD2.0, a question may also be unanswerable.

SQuAD1.1

SQuAD1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles.

Training example
{
    "data": [
        {
            "title": "University_of_Notre_Dame",
            "paragraphs": [
                {
                    "context": "Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend \"Venite Ad Me Omnes\". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.",
                    "qas": [
                        {
                            "answers": [
                                {
                                    "answer_start": 515,
                                    "text": "Saint Bernadette Soubirous"
                                }
                            ],
                            "question": "To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?",
                            "id": "5733be284776f41900661182"
                        }
                    ]
                }
            ]
        },
        …
   ]
}
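The nesting above (articles, then paragraphs, then question-answer entries) can be walked with a few lines of Python. The following is a minimal sketch, not an official loader: the helper names `iter_examples` and `misaligned_answers` and the tiny `sample` dict are hypothetical and only mirror the layout shown above. It also checks that each `answer_start` offset really points at the answer text inside `context`.

```python
def iter_examples(squad):
    """Yield (id, question, context, answers) for every question in a SQuAD-style dict."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield qa["id"], qa["question"], context, qa["answers"]


def misaligned_answers(squad):
    """Return ids of questions where an answer text does not sit at answer_start."""
    bad = []
    for qa_id, _question, context, answers in iter_examples(squad):
        for answer in answers:
            start = answer["answer_start"]
            end = start + len(answer["text"])
            if context[start:end] != answer["text"]:
                bad.append(qa_id)
                break  # one mismatch is enough to flag the question
    return bad


# Hand-made sample in the same layout (hypothetical, not taken from the dataset).
sample = {
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The grotto is a replica of the one at Lourdes.",
            "qas": [{
                "id": "q1",
                "question": "Where is the original grotto?",
                "answers": [{"answer_start": 38, "text": "Lourdes"}],
            }],
        }],
    }]
}

print(misaligned_answers(sample))  # prints []
```

A check like this is useful after any text cleanup, because `answer_start` is a character offset and a single stray space in `context` shifts every later answer.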
Inference example
{
	"data": [{
				"title": "Super_Bowl_50",
				"paragraphs": [{
							"context": "Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24\u201310 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \"golden anniversary\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \"Super Bowl L\"), so that the logo could prominently feature the Arabic numerals 50.",
							"qas": [{
										"answers": [{
											"answer_start": 177,
											"text": "Denver Broncos"
										}, {
											"answer_start": 177,
											"text": "Denver Broncos"
										}, {
											"answer_start": 177,
											"text": "Denver Broncos"
										}],
										"question": "Which NFL team represented the AFC at Super Bowl 50?",
										"id": "56be4db0acb8001400a502ec"
									}, {
										"answers": [{
											"answer_start": 249,
											"text": "Carolina Panthers"
										}, {
											"answer_start": 249,
											"text": "Carolina Panthers"
										}, {
											"answer_start": 249,
											"text": "Carolina Panthers"
										}],
										"question": "Which NFL team represented the NFC at Super Bowl 50?",
										"id": "56be4db0acb8001400a502ed"
									}
									...
								]
								...
Prediction result: predictions.json
{
    "56be4db0acb8001400a502ec": "Denver Broncos",
    "56be4db0acb8001400a502ed": "Denver Broncos",
    "56be4db0acb8001400a502ee": "February 7, 2016,",
    "56be4db0acb8001400a502ef": "Denver Broncos",
    "56be4db0acb8001400a502f0": "gold",
    "56be8e613aeaaa14008c90d1": "golden anniversary",
    "56be8e613aeaaa14008c90d2": "February 7, 2016,",
    "56be8e613aeaaa14008c90d3": "American Football Conference",
    "56bea9923aeaaa14008c91b9": "golden anniversary",
    "56bea9923aeaaa14008c91ba": "American Football Conference",
    "56bea9923aeaaa14008c91bb": "February 7, 2016,",
    "56beace93aeaaa14008c91df": "Denver Broncos",
    "56beace93aeaaa14008c91e0": "Levi's Stadium",
    "56beace93aeaaa14008c91e1": "Santa Clara, California.",
    "56beace93aeaaa14008c91e2": "\"Super Bowl L\"),",
    "56beace93aeaaa14008c91e3": "2015",
    "56bf10f43aeaaa14008c94fd": "2015",
    "56bf10f43aeaaa14008c94fe": "Santa Clara, California.",
    "56bf10f43aeaaa14008c94ff": "Levi's Stadium",
    "56bf10f43aeaaa14008c9500": "24\u201310",
    "56bf10f43aeaaa14008c9501": "February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    "56d20362e7d4791d009025e8": "2015",
    "56d20362e7d4791d009025e9": "Denver Broncos",
    "56d20362e7d4791d009025ea": "Denver Broncos",
    "56d20362e7d4791d009025eb": "Denver Broncos",
    "56d600e31c85041400946eae": "2015",
    "56d600e31c85041400946eb0": "Denver Broncos",
    "56d600e31c85041400946eb1": "February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    "56d9895ddc89441400fdb50e": "Super Bowl 50",
    "56d9895ddc89441400fdb510": "Denver Broncos",
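A prediction file of this shape (question id mapped to answer string) is typically scored by comparing each prediction with the reference answers for the same id. The sketch below shows a simplified exact-match score, assuming the normalization commonly used for SQuAD evaluation (lowercasing, dropping punctuation and English articles, collapsing whitespace). The function names are hypothetical, and the `gold` dict is built by hand from the two questions shown in the inference example; this is not the official evaluation script.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_texts):
    """Return 1 if the normalized prediction equals any normalized reference, else 0."""
    pred = normalize_answer(prediction)
    return int(any(pred == normalize_answer(gold) for gold in gold_texts))


def score(predictions, gold):
    """Percentage of reference ids answered exactly; missing ids count as wrong."""
    if not gold:
        return 0.0
    hits = sum(exact_match(predictions.get(qid, ""), texts)
               for qid, texts in gold.items())
    return 100.0 * hits / len(gold)


# First two entries of the predictions shown above.
predictions = {
    "56be4db0acb8001400a502ec": "Denver Broncos",
    "56be4db0acb8001400a502ed": "Denver Broncos",
}
# Reference answers for the same ids, from the inference example.
gold = {
    "56be4db0acb8001400a502ec": ["Denver Broncos"],
    "56be4db0acb8001400a502ed": ["Carolina Panthers"],
}

print(score(predictions, gold))  # prints 50.0
```

Normalization matters for predictions such as "Santa Clara, California.", where trailing punctuation would otherwise make a correct span count as a miss.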

SQuAD2.0

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
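Abstaining can be made concrete with a small sketch. It rests on two assumptions that are not shown in this document: that each question entry in the SQuAD2.0 JSON carries an `is_impossible` flag, and that a system signals "no answer" by predicting an empty string. The helper names and the sample data are hypothetical.

```python
def split_by_answerability(squad):
    """Return (answerable_ids, unanswerable_ids) for a SQuAD2.0-style dict."""
    answerable, unanswerable = [], []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # Assumption: unanswerable questions have is_impossible=True.
                # If the flag is absent, treat "no answers listed" as unanswerable.
                impossible = qa.get("is_impossible", not qa.get("answers"))
                (unanswerable if impossible else answerable).append(qa["id"])
    return answerable, unanswerable


def abstention_rate(predictions, unanswerable_ids):
    """Fraction of unanswerable questions for which the system predicted ""."""
    if not unanswerable_ids:
        return 0.0
    abstained = sum(1 for qid in unanswerable_ids if predictions.get(qid) == "")
    return abstained / len(unanswerable_ids)


# Hand-made sample (hypothetical ids and text).
sample = {
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The game was played at Levi's Stadium.",
            "qas": [
                {"id": "q_a", "question": "Where was the game played?",
                 "is_impossible": False,
                 "answers": [{"answer_start": 23, "text": "Levi's Stadium"}]},
                {"id": "q_u", "question": "Who sang the anthem?",
                 "is_impossible": True, "answers": []},
            ],
        }],
    }]
}

answerable, unanswerable = split_by_answerability(sample)
print(answerable, unanswerable)  # prints ['q_a'] ['q_u']
print(abstention_rate({"q_a": "Levi's Stadium", "q_u": ""}, unanswerable))  # prints 1.0
```

Reporting the abstention rate separately from the answerable-question score makes it easier to see whether a system is guessing on questions the paragraph does not support.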
