NTCIR-11 QA Lab Japanese subtask: Wikipedia Data Set
NTCIR has made this data set publicly available under the conditions of the Creative
Commons Attribution-ShareAlike 3.0 Unported License. Users of the Corpora
and Topics are advised to read Wikipedia's copyright policy carefully to ensure proper usage.
Download
Wikipedia Indri indexed Dataset:
- readme (readme.html, written in Japanese) (9 KB)
- Wikipedia Indri indexed Dataset 1 (wiki-100M-txt.zip) (1,229,052 KB)
- Wikipedia Indri indexed Dataset 2 (make-index-Wikipedia.zip) (70,808 KB)
- Wikipedia Indri indexed Dataset 3 (wiki-index.tar.gz: split into eight parts, wiki-index-aa.tar.gz through wiki-index-ah.tar.gz)
wiki-index-aa.tar.gz (1,048,576 KB)
wiki-index-ab.tar.gz (1,048,576 KB)
wiki-index-ac.tar.gz (1,048,576 KB)
wiki-index-ad.tar.gz (1,048,576 KB)
wiki-index-ae.tar.gz (1,048,576 KB)
wiki-index-af.tar.gz (1,048,576 KB)
wiki-index-ag.tar.gz (1,048,576 KB)
wiki-index-ah.tar.gz (780,744 KB)
* To save a file, right-click its link and choose the save option from the context menu.
* Internet Explorer (IE) may automatically uncompress a file while downloading it. To keep the compressed file as it is, use a different browser.
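
Dataset 3 is distributed in eight parts, so the parts have to be joined before the index can be used. The following Python sketch shows one way to do that. It assumes the parts are plain byte-wise slices of wiki-index.tar.gz (as the Unix `split` command would produce) and that joining them in suffix order, aa through ah, restores the original archive. This page does not confirm that, so check readme.html first. The function names `part_names` and `join_parts` are hypothetical and used only for illustration.

```python
import shutil
from pathlib import Path

# Part suffixes as listed on this page: aa through ah.
PART_SUFFIXES = ["aa", "ab", "ac", "ad", "ae", "af", "ag", "ah"]


def part_names(stem="wiki-index", ext=".tar.gz"):
    """Return the part file names in the order they should be joined."""
    return [f"{stem}-{suffix}{ext}" for suffix in PART_SUFFIXES]


def join_parts(src_dir, out_path):
    """Concatenate the parts byte for byte into out_path.

    Assumption: the parts are raw slices of one archive, so simple
    concatenation in suffix order restores it. Returns the size of
    the joined file in bytes.
    """
    src_dir = Path(src_dir)
    parts = [src_dir / name for name in part_names()]

    # Fail early if any part has not been downloaded yet.
    missing = [p.name for p in parts if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing parts: {missing}")

    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams in chunks, low memory use

    return Path(out_path).stat().st_size
```

For example, `join_parts("downloads", "wiki-index.tar.gz")` would read the eight parts from a hypothetical `downloads` directory. If the sizes listed above are exact, the joined file should come to about 8,120,776 KB (7 × 1,048,576 KB + 780,744 KB), which gives a quick check that no part was truncated during download.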
License

Use and/or redistribution of the Wikipedia search results for the Japanese Entrance Exam subtask is permitted under the conditions of the Creative Commons Attribution-ShareAlike
3.0 Unported License.
Details can be found at http://creativecommons.org/licenses/by-sa/3.0/.