I suspect I have already written an answer to this at some other internet venue, but basically this comes down to understanding the link between word order, prosody (intonation) and information structure - the latter is conveyed by the former two. If you cannot convey information in Latin the way the natives did, you cannot predict the way those natives conveyed information. Therefore you cannot predict what's going to follow in the text; and if you can't do that, your comprehension is severely impeded. Most of speech is formulaic and most of understanding is predicting via disambiguation and elimination - as you listen, you discard the impossible options based on context, even syllable by syllable, and most of the time you arrive at the correct option well before you hear the finished segment of speech. Mishearing what the other person has said is nothing more than arriving at the wrong option, normally because your brain makes an associative leap to something recently on your mind. It would be infeasible to understand speech only after hearing the entirety of the sentence, cross-checking all its possible interpretations and finally arriving at the correct meaning; the brain is far too non-linear and slow for this. Besides, there's much inherent ambiguity in ordinary speech which in reality your brain doesn't even register - puns exploit this.
Well, I hope you understand what I'm saying. If this seems like too much speculation, it is theoretically informed to at least some degree, and I find that laying stress on coming to grips with just that link between word order, prosody and information structure has allowed me to think along with most classical texts whose subject matter doesn't take me by complete surprise. I wouldn't go as far as saying I can listen to any prose and understand it on the first try though - I don't even necessarily understand all written English prose on the first try, or Russian for that matter.
To take your example: Prima est eloquentiae perspecuitas virtute is a non-sentence - the correct sentence is Nam et prīma est ēloquentiae virtūs perspicuitās, et... The only notable thing happening here is that prīma is contrastively fronted in order to give it informational prominence: 'Clarity is both the main <and not just any> virtue of eloquence, and <moreover>...' English has contrastive fronting too, but under different circumstances ('I saw that' > 'That is what I saw'), while in a sentence like this one it marks the prominence by intonation, which in Latin was probably far less flexible. The verb est regularly follows any fronted word, and otherwise tends to stick to the end of the more prominent phrase. The whole is preceded by the continuity-of-reasoning discourse particle nam and by the conjunction et (coordinated with another et later).
But in the absence of fronting this same word order is still totally standard: in Prīma ēloquentiae virtūs perspicuitās (est) the topic (the given, what the sentence is about) would be prīma ēloquentiae virtūs, the comment (new information, the answer to the question) perspicuitās. The topic precedes the comment, as is usual. The genitive precedes the abstract noun, which is again normal, though virtūs ēloquentiae prīma might actually be the more basic underlying order, with prīma being fronted as more prominent - only this time the fronting concerns the noun phrase only instead of the whole sentence (this is still a debated subject in linguistics). Here est comes at the end (and can easily be left out) because there's an intonational break between perspicuitās and the rest of the sentence - intonation rises or remains raised up to virtūs, and falls on perspicuitās as being the comment. If you wish to give the comment more prominence, you front it (together with est) and get a structure analogous to 'It's clarity that's the main virtue of eloquence', 'It's John whom I saw', only Latin doesn't need the dummy subject it's...that ('John is whom I saw') or even the is whom structure: Gāium Gāia vīdit, nōn Jūlium.