Corpus Information for Swahili [swa] Tanzania

Language
Swahili
ISO Code
swa (Wikipedia, Ethnologue, Glottolog, MultiTree, ScriptSource)
Country
Tanzania
Corpus Name                         Tokens    Types  Sentences  Sources (URLs)  Build date
swa_community_2017 (LCC Portal)  6,266,869  277,952    290,396          29,328  2017-10-12
swa_community_2019                  19,168    3,417      1,116              72  2019-03-26
swa_community_2022               1,037,192   76,762     43,336           2,756  2022-12-12
swa_community_2023               1,412,549   93,552     59,342           4,949  2023-03-02
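The counts listed above can be turned into two commonly reported derived figures: average sentence length (tokens per sentence) and type-token ratio (types divided by tokens). The following is a minimal sketch only. The numbers are copied from this listing, while the names `CORPORA` and `derived_stats` are hypothetical and not part of any corpus tooling.

```python
# Figures copied from the corpus listing above: (tokens, types, sentences).
# CORPORA and derived_stats are illustrative names, not an official API.
CORPORA = {
    "swa_community_2017": (6_266_869, 277_952, 290_396),
    "swa_community_2019": (19_168, 3_417, 1_116),
    "swa_community_2022": (1_037_192, 76_762, 43_336),
    "swa_community_2023": (1_412_549, 93_552, 59_342),
}


def derived_stats(tokens: int, types: int, sentences: int) -> dict:
    """Return average sentence length and type-token ratio for one corpus."""
    return {
        "tokens_per_sentence": tokens / sentences,
        "type_token_ratio": types / tokens,
    }


stats = {name: derived_stats(*counts) for name, counts in CORPORA.items()}

for name, s in stats.items():
    print(
        f"{name}: {s['tokens_per_sentence']:.2f} tokens/sentence, "
        f"type-token ratio {s['type_token_ratio']:.4f}"
    )
```

Type-token ratio falls as a corpus grows, so it should not be compared directly between corpora of very different sizes, such as swa_community_2017 and swa_community_2019.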
URLs
List of URLs download
List of Domains download
Download
swa_community_2017 2017-10-12
swa_community_2019 2019-03-26
swa_community_2022 2022-12-12
swa_community_2023 2023-03-02
Contact
No contact person for this language.