
  1. This directory contains redistributable wordnets in further sub-directories.
  2. Structure is:
  3. proj/wn-data-lang.tab synset lemma pairs (see below)
  4. proj/LICENCE original license file (or equivalent)
  5. proj/README Any notes about the conversion
  6. proj/lang2tab.py python script to extract the data
  7. may rely on wordnet version mappings
  8. proj/wn-data-lang.tab.log any notes from the conversion
  9. proj/citation.bib the canonical citation reference(s)
  10. Note that a single directory may have wordnets for multiple languages
  11. wn-data is formatted as follows:
  12. # name<tab>lang<tab>url<tab>license
  13. offset-pos<tab>type<tab>lemma
  14. offset-pos<tab>type<tab>lemma
  15. ...
  16. name is the name of the project
  17. lang is the ISO 639-3 three-letter code for the language
  18. url is the url of the project
  19. license is a short name for the license
  20. offset is the 8-digit Princeton WordNet 3.0 synset offset
  21. pos is one of [a,s,v,n,r]
  22. lemma is the lemma (word separator normalized to ' ')
  23. type is the language:relationship (e.g. eng:lemma)
  24. Example:
  25. # Thai	tha	http://th.asianwordnet.org/	wordnet
  26. 13567960-n	tha:lemma	กระบวนการทรานแอมมิแนชัน
  27. 00155298-n	tha:lemma	การปฏิเสธ
  28. 14369530-n	tha:lemma	ภาวะการหายใจเร็วของทารกแรกเกิด
  29. 10850469-n	tha:lemma	เบธัน
  30. 11268326-n	tha:lemma	เรินต์เกน
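
The format described above can be read with a few lines of Python. The following is only a minimal sketch, not the project's own tooling (the lang2tab.py scripts write these files; they are not shown here). The names parse_wn_data, Header and Entry are hypothetical, and the sketch assumes that fields are separated by real tab characters and that the first line starting with '#' is the header.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Header(NamedTuple):
    name: str      # name of the project
    lang: str      # three-letter language code
    url: str       # url of the project
    license: str   # short name for the license


class Entry(NamedTuple):
    offset: str    # 8-digit offset, kept as a string to preserve leading zeros
    pos: str       # one of a, s, v, n, r
    type: str      # language:relationship, e.g. "tha:lemma"
    lemma: str


def parse_wn_data(lines: Iterable[str]) -> Tuple[Optional[Header], List[Entry]]:
    """Parse the lines of a wn-data-lang.tab file into a header and entries."""
    header = None
    entries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("#"):
            # Assumption: only the first '#' line is treated as the header.
            if header is None:
                fields = line[1:].lstrip().split("\t")
                if len(fields) == 4:
                    header = Header(*fields)
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip lines that do not have exactly three fields
        synset, type_, lemma = fields
        offset, _, pos = synset.partition("-")
        entries.append(Entry(offset, pos, type_, lemma))
    return header, entries


if __name__ == "__main__":
    sample = (
        "# Thai\ttha\thttp://th.asianwordnet.org/\twordnet\n"
        "00155298-n\ttha:lemma\tการปฏิเสธ\n"
        "10850469-n\ttha:lemma\tเบธัน\n"
    )
    header, entries = parse_wn_data(sample.splitlines())
    print(header.lang, len(entries), entries[0].offset, entries[0].pos)
```

Offsets are kept as strings because values such as 00155298 would lose their leading zeros if converted to integers.
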
  31. This data is formatted by the Open Multilingual Wordnet Project
  32. to be used by NLTK.
  33. Please cite us if you find the aggregation useful (see citation.bib)
  34. and email us if you have any suggestions.
  35. Francis Bond ([email protected])
  36. https://omwn.org/
  37. 2021-12-05
  38. 31 languages covered (and we assume you have English):
  39. wn-data-als.tab
  40. wn-data-arb.tab
  41. wn-data-bul.tab
  42. wn-data-cmn.tab
  43. wn-data-dan.tab
  44. wn-data-ell.tab
  45. wn-data-fin.tab
  46. wn-data-fra.tab
  47. wn-data-heb.tab
  48. wn-data-hrv.tab
  49. wn-data-isl.tab
  50. wn-data-ita.tab
  51. wn-data-ita.tab
  52. wn-data-jpn.tab
  53. wn-data-cat.tab
  54. wn-data-eus.tab
  55. wn-data-glg.tab
  56. wn-data-spa.tab
  57. wn-data-ind.tab
  58. wn-data-zsm.tab
  59. wn-data-nld.tab
  60. wn-data-nno.tab
  61. wn-data-nob.tab
  62. wn-data-pol.tab
  63. wn-data-por.tab
  64. wn-data-ron.tab
  65. wn-data-lit.tab
  66. wn-data-slk.tab
  67. wn-data-slv.tab
  68. wn-data-swe.tab
  69. wn-data-tha.tab
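
Because the language code is part of each file name, a list of paths in the proj/wn-data-lang.tab layout can be grouped by language. The following is a rough sketch only: index_by_language is a hypothetical helper, and the directory names projA and projB are placeholders rather than real project directories.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Matches data files only; wn-data-lang.tab.log and other files are ignored.
WN_DATA_NAME = re.compile(r"wn-data-([a-z]{3})\.tab")


def index_by_language(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each language code to the project directories that provide it."""
    index = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        match = WN_DATA_NAME.fullmatch(path.name)
        if match:
            index[match.group(1)].append(path.parent.name)
    return dict(index)


if __name__ == "__main__":
    paths = [
        "projA/wn-data-ita.tab",      # placeholder directory names
        "projB/wn-data-ita.tab",
        "projA/wn-data-tha.tab",
        "projA/wn-data-tha.tab.log",  # conversion log, not a data file
    ]
    for lang, projects in sorted(index_by_language(paths).items()):
        print(lang, projects)
```

Grouping this way makes it visible when the same language code is provided by more than one project directory.
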