Last Friday I picked up a dictionary of the Mahu dialect of Eastern Votic (Castrenianumin toimitteita 27, 1986), based on Lauri Kettunen’s collections from about a hundred years ago.
This is not a particularly huge book, with only about 150 pages of lexical data, set in a relatively large monotype font, too. It probably wouldn’t be of much use if one wished to, say, translate Firefox into Votic. Its usability as a tourist dictionary might be limited as well (even if we ignore the sad fact that Votic is severely moribund, with only some dozens of speakers left). But it seems like a good reference for a linguist wishing to make some contact with the language. Or: a handy unit of data for a linguist wishing to understand the lexical structure of languages.
The lexicons of natural languages are not random in their makeup. Phonemes have differing frequencies of occurrence in different positions of words, and different tendencies to combine with each other. And although one can certainly find linguists who will attempt to offer explanations in terms of elaborate synchronic phonological constraints and preferences, I find this a fundamentally flawed approach. Much more often, any patterns evident in the lexicon are best understood as the fossilized results of historical processes: sound changes, loanword strata and evolving standards of sound-symbolic conventions. The study of a language’s lexicon even at a single point in time will likely turn up insights into its history.
For this type of analysis, this Votic dictionary actually seems like a rather good-sized sample. The lexicon of any major literary language would be both overwhelming in size (possibly thousands of pages) and swamped with recent cultural loanwords (if you happen to find a word shaped approx. like /banana/ or /platinum/ in a given language, this will not tell you much about its prehistory). Neither of these problems is apparent here, and it’s possible to focus on the big picture without getting stuck on data wrangling. At the other end, a still shorter list of, say, 100 words, whether artificially truncated or recorded in passing in 1820 from some now-extinct language, would not allow for many statistically significant conclusions at all.
A simple starter example: the Finnic languages originally did not contrast voicing in obstruents (a situation inherited from Proto-Uralic). This remains the case in Estonian, Northern Karelian, and dialects of Finnish. Votic, however, is among the siblings that have fully embraced voicing, and contrasts voiced and voiceless versions of all obstruent consonants: /p t tš k f s š/ ≠ /b d dž g v z ž/. Suppose we were to hand a copy of this dictionary to a linguist who’s never worked with Finnic before. Will they be able to uncover this older constraint?
The answer seems likely to be “yes”. Only minor etymological analysis is required, and the dictionary itself even provides it. The lexemes in the dictionary are glossed in both Russian and Finnish, the two major contact languages of Votic. Additionally, several words identifiable as recent Russian loans are indeed so marked. This allows an initial separation of the lexicon into two mostly disjoint layers: those of Finnic vs. Russian background. (Though of course Finnish has some Russian loanwords as well, and a small number of words whose origin is not immediately obvious can also be found.)
A look at words beginning with voiced obstruents other than /v/, as well as words beginning with /f/, shows that they, as a rule, belong in the Russian layer. This is a small set to begin with, and once the Russian layer is set aside, no more than seven counterexamples remain:
- balalaittaag ‘to gossip’
- bëëg ‘isn’t’
- borissag ‘to bubble’
- bulissag ‘to bubble’
- börö ‘ironing board’
- däädi ‘some relative’
- filissaag ‘to whistle’
So we have four onomatopoetic verbs, one unstressed particle, one nursery word, and one fully legit content word. This is not sufficient evidence to posit that the voicing contrast is original in initial position, not when evidently inherited words beginning with /p t tš k s v/ number in the multiple hundreds altogether.
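For anyone who would rather do this sort of filtering on a digitized word list, the procedure above can be roughly sketched in Python. This is only an illustration under some assumptions: that each entry has been reduced to a headword plus a yes/no flag for "marked as a Russian loan", and that looking at the first letter is enough to catch the initial consonant. The function name, the data layout and the two placeholder entries are made up for the example; only the seven counterexample headwords come from the dictionary.

```python
# Initial letters that signal a voiced obstruent other than /v/, plus /f/.
# The affricate written "dž" is covered by its first letter "d".
MARKED_INITIALS = set("bdgzžf")


def residual_counterexamples(entries):
    """Return headwords with a marked initial that are NOT flagged as Russian loans.

    `entries` is an iterable of (headword, is_russian_loan) pairs; this layout
    is an assumption for the sketch, not the format of the printed dictionary.
    """
    return [
        word
        for word, is_russian_loan in entries
        if word and word[0].lower() in MARKED_INITIALS and not is_russian_loan
    ]


entries = [
    # The seven counterexamples listed above (none flagged as loans).
    ("balalaittaag", False),
    ("bëëg", False),
    ("borissag", False),
    ("bulissag", False),
    ("börö", False),
    ("däädi", False),
    ("filissaag", False),
    # Placeholders only, NOT real Votic words: one stand-in for a flagged
    # loan with a voiced initial, one for an inherited voiceless-initial word.
    ("gaaa", True),
    ("paaa", False),
]

remaining = residual_counterexamples(entries)
print(len(remaining), remaining)
```

With real data, the interesting number is the ratio between this residue and the count of inherited words with voiceless initials. A fuller version would also need proper segmentation of digraphs rather than a first-letter check.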
A more detailed examination would find that medial voiced consonants other than /v/ can similarly be shown to be secondary: they occur as the consonant gradation alternants of the voiceless ones. Exceptions, as a rule, again occur only in Russian loans and probably some onomatopoeia. The full details would be more difficult to dig up though, so I am leaving this as an exercise for the interested reader. ;)
 In case anyone else is interested, some overflow stock of these from dunno where is still up for grabs at the University of Helsinki’s Dept. of Finno-Ugric Studies (Metsätalo/Unioninkatu 40, 4th floor).
 This may not be an entirely fair comparison, but… I have in mind the image of a “generative geologist” attempting to locate physical constraints present in gneiss or sediment that force its minerals to hold a macroscopically banded rather than homogeneous structure.
 I will not dwell on /š/, also mainly a loanword phoneme.