[offline dictionary] Ramshorn, 1860

LCF

a.k.a. Lucifer
Do you have a cleanly digitized Latin-to-Latin dictionary?
 

LCF

a.k.a. Lucifer
Are you able to generate a format like this?

This is from the L&S Elementary:

**abāctus** ⇾ driven away, driven off: nox abacta, driven back (from the pole), i. e. already turned towards dawn, V.: abacta nullā conscientiā, restrained by, H.
**abacus** ⇾ a table of precious material for the display of plate, C.; Iuv.
**abaliēnātiō** ⇾ a transfer of property, sale, cession, C.
**abaliēnō** ⇾ to convey away, make a former transfer of, sell, alienate: agros vectigalīs populi R.: pecus. — Fig., to separate, remove, abstract: ab sensu rerum animos, abstracted their thoughts from, L.: deminuti capite, abalienati iure civium, deprived of, L. — In partic., to alienate, estrange, make hostile, render disaffected: abalienati scelere istius a nobis reges, from us, by his wickedness: aratorum numerum abs te: periurio homines suis rebus, N.: totam Africam, estrange, N.
**Abantēus** ⇾ of Abas (king of Argos): Argi, O.
**Abantiadēs** ⇾ a son or descendant of Abas (king of Argos), O.
**abavus** ⇾ a grandfather's grandfather, C.; an ancestor (rare), C.
**abcīdō**
**Abdēra** ⇾ a town of Thrace, proverbial for narrow-minded people, C., L.
**abdicātiō** ⇾ a formal laying down, voluntary renunciation, abdication: dictaturae, L.
**abdicō** ⇾ to disown, disavow, reject: ubi plus mali quam boni reperio, id totum abdico atque eicio: abdicari Philippum patrem, Cu. — With se and abl, to give up an office before the legal term expires, resign, abdicate (cf. depono, to lay down an office at the expiration of the term): dictaturā se abdicat, Cs.: se consulatu: respondit aedilitate se abdicaturum, L. — Once absol. (of consuls), to abdicate, resign, C. — With acc: abdicato magistratu, S.: causa non abdicandae dictaturae, L.
**abdīcō** ⇾ to forbid by an unfavorable omen, reject (opp. addico), C.
**abditus** ⇾ hidden, concealed, secret: virgo, locked up, H.: sub terram: ne ea omnia... ita abdita latuisse videantur, ut, etc., hidden beyond discovery: copias abditas constituunt, in ambush, Cs.: secreta Minervae, mysterious, O.: latet abditus agro, hidden in, H.: (sagitta) abdita intus Spiramenta animi rupit, buried, V. — As subst n., hidden places, Ta.: abdita rerum (a Greek idiom for abditae res), abstruse matters, H.
**abdō** ⇾ to put away, remove, set aside: impedimenta in silvas, Cs.; often with se, to go away, betake oneself: se in contrariam partem terrarum: se in Menapios, to depart, Cs.: se domum. — Praegn., to hide, conceal, put out of sight, keep secret: amici tabellas: pugnare cupiebant, sed abdenda cupiditas erat, L.: sese in silvas, Cs.: se in tenebris: ferrum in armo, O.: alqm intra tegimenta, Cs.: abdito intra vestem ferro, L.: ferrum curvo tenus hamo, up to the barb, O.: argentum Abditum terris, H.: caput casside, to cover with, O.: voltūs frondibus, O.: hunc (equum) abde domo, let him rest, V.: se litteris: lateri ensem, buried, V.: sensūs suos penitus, Ta.
**abdōmen** ⇾ the belly, abdomen: abdomine tardus, unwieldy, Iuv. — Fig., gluttony, greed: insaturabile: abdominis voluptates.
**abdūcō** ⇾ (imper. sometimes abdūce, T.), to lead away, take away, carry off, remove, lead aside: filiam abduxit suam, has taken away (from her husband), T.: cohortes secum,
 
Are you able to generate a format like this?
From what source? I will be glad to help, if you say more about the result you wish to achieve.

I think the simplest way, which anyone can do themselves, is:
  1. use PyGlossary to convert a dictionary to plain text (choose "Tabfile" as the output format)
  2. strip XML tags from the output in a text editor (replace "<.*?>" in regexp mode)
Nontrivial transformations would require some programming. We produce XDXF files for every one of our dictionaries, specifically for interoperability. It's a text format, simple and well documented.
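
The two steps above can also be scripted. Here is a minimal Python sketch, assuming the Tabfile output holds one `headword<TAB>definition` entry per line and that line breaks inside a definition show up as a literal `\n`. The function name `tabfile_line_to_entry` is made up for illustration; the regex is the same `<.*?>` as in step 2.

```python
import html
import re

# Same non-greedy pattern as in step 2 of the list above.
TAG = re.compile(r"<.*?>")


def tabfile_line_to_entry(line: str) -> str:
    """Turn one 'headword<TAB>definition' line into '**headword** ⇾ definition'."""
    headword, _, definition = line.rstrip("\n").partition("\t")
    # Assumption: newlines inside a definition are escaped as a literal backslash-n.
    definition = definition.replace("\\n", " ")
    # Drop the markup, then decode entities such as &amp;.
    definition = html.unescape(TAG.sub("", definition))
    # Collapse runs of whitespace left behind by removed tags.
    definition = " ".join(definition.split())
    if not definition:
        return f"**{headword}**"
    return f"**{headword}** ⇾ {definition}"
```

A headword with no definition comes out as a bare bold headword, like the abcīdō line in the sample.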
 

LCF

a.k.a. Lucifer
Are you able to generate a format like this?

From what source?
From all the dictionaries that you have.

I will be glad to help, if you say more about the result you wish to achieve.
Given a word, I am trying to capture its meaningful associations from a multitude of sources.

**abāctus** ⇾ driven away, driven off: nox abacta, driven back (from the pole), i. e. already turned towards dawn, V.: abacta nullā conscientiā, restrained by, H.
**abacus** ⇾ a table of precious material for the display of plate, C.; Iuv.
**abaliēnātiō** ⇾ a transfer of property, sale, cession, C.
**abaliēnō** ⇾ to convey away, make a former transfer of, sell, alienate: agros vectigalīs populi R.: pecus. — Fig., to separate, remove, abstract: ab sensu rerum animos, abstracted their thoughts from, L.: deminuti capite, abalienati iure civium, deprived of, L. — In partic., to alienate, estrange, make hostile, render disaffected: abalienati scelere istius a nobis reges, from us, by his wickedness: aratorum numerum abs te: periurio homines suis rebus, N.: totam Africam, estrange, N.
**Abantēus** ⇾ of Abas (king of Argos): Argi, O.

This is a fragment of the Elem. L&S, which I generated from the XML. I'll use it as an example.

Let's take a look at abacus:

**abacus** ⇾ a table of precious material for the display of plate, C.; Iuv.

One way to capture information about it from this dictionary is to count how many times it appears and co-occurs with the words in its description, and then normalize the counts.

This new representation captures some notion of the meaning of abacus.

abacus [1, 6]
{
table: 1,
precious: 0.994009,
material: 0.991026973,
display: 0.982134461213543,
plate: 0.976250493656412,
}

Not a precise meaning. But some information about abacus seems to emerge.

We can think of it as follows:

"There is more chance that abacus is a table than it is a plate. Even though a plate is related to abacus somehow."

It's a small piece of information, but it is still information.
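
The post does not say exactly how the counts are weighted or normalized. One guess that happens to reproduce the abacus numbers above is to weight each description word by 0.997 raised to its token position, drop very short tokens, and divide by the largest weight. The constant, the length cutoff, and the name `association_scores` are all assumptions of this sketch, not the stated method.

```python
import re
from collections import defaultdict

DECAY = 0.997    # assumption: per-token decay, chosen to match the numbers above
MIN_LENGTH = 4   # assumption: crude filter for articles, prepositions, author sigla


def association_scores(definition, decay=DECAY, min_length=MIN_LENGTH):
    """Score the words of one definition by position, normalized to a maximum of 1."""
    tokens = re.findall(r"[^\W\d_]+", definition)  # runs of letters only
    weights = defaultdict(float)
    for position, token in enumerate(tokens):
        if len(token) >= min_length:
            # Repeated words accumulate weight, which is the "counting" part.
            weights[token] += decay ** position
    if not weights:
        return {}
    top = max(weights.values())
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {word: weight / top for word, weight in ranked}


scores = association_scores(
    "a table of precious material for the display of plate, C.; Iuv."
)
for word, score in scores.items():
    print(f"{word}: {score}")
```

This handles a single entry only. Over a whole dictionary one would presumably sum the weights across every entry in which the word appears before normalizing.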

Let's look at a few more entries, but from the point of view of a word that appears in the descriptions.

Alexander appears 2 times, in the entries for effulgeō and Philippus:

**effulgeō** ⇾ to shine out, gleam forth, flash out: nova lux oculis effulsit, V.: Faleriis ingens lumen effulsisse, L.: auro, V. — Fig.: effulgebant Philippus ac Alexander, L.: audaciā aut insignibus effulgens, Ta.

**Philippus** ⇾ a king of Macedonia, father of Alexander the Great, C., N. — A gold coin struck by King Philip, H.


Alexander [2, 2]
{
Philippus: 1,
effulgeo: 0.94735521379144
}
Not much to say about Alexander. It is related somehow to Philippus. Maybe the entry for Philippus will have more information.

Philippus [6, 15]
{
king: 1,
Macedonia: 0.994009,
father: 0.991026973,
Alexander: 0.985089730404757,
Great: 0.979188057829902,
gold: 0.96749057161807,
theatrum: 0.96749057161807,
coin: 0.964588099903216,
struck: 0.961694335603506,
King: 0.955932824838906,
Philip: 0.953065026364389,
effulgeo: 0.938854569879498,
spes: 0.845140405446171,
verto: 0.499552998365099,
accedo: 0.360042547541048
}

Yeah, true. We can see that Philippus is probably a king and a father, and somehow related to Macedonia and Alexander.

But what about Alexander itself?

Alexander [2, 2]
{
Philippus: 1,
effulgeo: 0.94735521379144
}

Again, it is related somehow to Philippus and that's all the information we were able to capture directly.

Asking a machine:

"quis est Alexander?" shouldn't give a table as an answer, right?

Instead it should be able to answer with "Alexander Magnus, rex Macedoniae, natus Philippi..." or some such.

I need a different representation of Alexander, one that captures some information from Philippus as well.

In "Efficient Estimation of Word Representations in Vector Space" (Mikolov et al.), the information is captured by reading lots of text and tracking which words appear near which other words more often. You can visualize Alexander here: azret.github.io/latin/#alexander
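
As a rough illustration of what "which words appear together" means in that paper's skip-gram setup: each word is paired with its neighbours inside a small window, and the model is trained on those pairs. The sketch below only extracts the pairs, not the training itself; the window size and the function name `skipgram_pairs` are illustrative assumptions.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours in the window."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Example taken from the effulgeō entry quoted earlier.
for pair in skipgram_pairs(["effulgebant", "Philippus", "ac", "Alexander"], window=1):
    print(pair)
```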

Training vectors using the same technique but only from a single dictionary like L&S will produce a better representation of Alexander:

This is after just 10 minutes of training...

Alexander : 1
effulsit : 0.887274080840863
effulgebant : 0.874044220052225
effulsisse : 0.865945801692241
Faleriis : 0.853887546569713
effulgens : 0.846735124811042
King : 0.83131996326551
effulgeo : 0.749846446235245
Philippus : 0.719970844785784
belo : 0.625529144941858
Fele : 0.619990437805543
insignibus : 0.595397827052992

Philippus : 1
King : 0.765685538780663
Alexander : 0.719970844785784
effulsit : 0.650588264337988
effulgeo : 0.607649944781956
vincire : 0.543377126757282
Macedonia : 0.527472171116464
frugalitas : 0.524689778651647
volucris : 0.522859041348755
ceram : 0.521628085403608
cognata : 0.520728029161825
attach : 0.516165540894491
castrum : 0.515595690742273
divino : 0.515050999142493
Philip : 0.514014026462802
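
Lists like these are typically produced by ranking every vocabulary word by cosine similarity to the query word's vector, which is why the query itself scores 1. The post does not say which similarity measure was used, so cosine is an assumption here, and the three-dimensional vectors are toy values made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(word, vectors, top_n=5):
    """Rank all words in `vectors` by cosine similarity to `word`, best first."""
    query = vectors[word]
    scored = [(other, cosine(query, vec)) for other, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy vectors; real embeddings would have far more dimensions.
vectors = {
    "Alexander": [0.9, 0.1, 0.3],
    "Philippus": [0.8, 0.3, 0.2],
    "abacus": [0.0, 1.0, 0.1],
}
for other, score in nearest("Alexander", vectors):
    print(f"{other} : {score}")
```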


From what source?
From all the dictionaries that you have. If you can give me those dictionaries in a format like the one above, I can train vectors on them and we can see what they capture.

 
So, you need them for Word Embedding. Don't you think that a dataset built from dictionary articles will not represent the real statistical distribution of the words? Maybe it would be better to take authentic Latin literature, such as "De Bello Gallico". There are numerous transcribed Latin texts on the Project Gutenberg website.

I'll take your request into consideration, but not as a high priority, because the processing described above is trivial. Do you need only English definitions of the words, or Latin, or a mix of them?
 

LCF

a.k.a. Lucifer
So, you need them for Word Embedding...

Word and syllable embeddings. Like https://fasttext.cc/, but instead of character n-grams I am trying syllable n-grams.
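
To make the difference concrete: fastText represents a word by its character n-grams, with `<` and `>` marking the word boundaries, while the variant described here would slide the same window over syllables instead. A minimal sketch, assuming the syllable split is already available (the split of effulgeo below is done by hand, and both function names are hypothetical):

```python
def char_ngrams(word, n):
    """Character n-grams with fastText-style '<' and '>' boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def syllable_ngrams(syllables, n):
    """n-grams over a pre-split list of syllables, returned as tuples."""
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


print(char_ngrams("where", 3))
print(syllable_ngrams(["ef", "ful", "ge", "o"], 2))
```

The syllabification itself is the hard part and is left out on purpose; any splitter that returns a list of strings would plug into `syllable_ngrams`.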

Maybe it would be better to take authentic Latin literature...
I make different embeddings from different sources. And see how they compare.

Do you need only English definitions of the words, or Latin, or a mix of them?
A mix or both, it does not matter, as long as the entire entry is on a single line, without the need to parse it out of any proprietary dictionary format.

I use WHITAKER for lemmatization. Attached is a fully exploded inflection table.
 


Those are images though. I was hoping for a digitized version.
My answer would be offtopic, so I moved it here.

I remember your interest, but currently have nothing to provide. Maybe the following full-text dictionaries could be useful to you, if French is acceptable in your research:
  • Dictionnaire latin-français (Gaffiot, 1934)
  • Dictionnaire français-latin (Édon, ?)
They are encoded in StarDict format, so with PyGlossary you could easily transform them into CSV.
 

LCF

a.k.a. Lucifer
Are these duplicates?

abgrego	Abgregare est a grege, Adgregare ad gregem ducere, Segregare ex pluribus gregibus partes seducere, unde et Egregius dicitur e grege lectus. Fest. R.
adgrego	Abgregare est a grege, Adgregare ad gregem ducere, Segregare ex pluribus gregibus partes seducere, unde et Egregius dicitur e grege lectus. Fest. R.
segrego	Abgregare est a grege, Adgregare ad gregem ducere, Segregare ex pluribus gregibus partes seducere, unde et Egregius dicitur e grege lectus. Fest. R.
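
Whether these count as duplicates depends on the goal: the three headwords differ, but the article text is identical. One quick way to find such groups across a whole file, assuming the entries are already available as (headword, definition) pairs, is to group headwords by definition text. The function name `shared_definitions` is made up for this sketch.

```python
from collections import defaultdict


def shared_definitions(entries):
    """Map each definition used by 2+ headwords to the list of those headwords."""
    groups = defaultdict(list)
    for headword, definition in entries:
        # Normalize whitespace so trivial spacing differences do not hide a match.
        groups[" ".join(definition.split())].append(headword)
    return {text: words for text, words in groups.items() if len(words) > 1}


entries = [
    ("abgrego", "Abgregare est a grege, Adgregare ad gregem ducere"),
    ("adgrego", "Abgregare est a grege,  Adgregare ad gregem ducere"),
    ("segrego", "Abgregare est a grege, Adgregare ad gregem ducere"),
    ("abacus", "a table of precious material"),
]
for text, words in shared_definitions(entries).items():
    print(words, "share:", text)
```

For training embeddings, one option would be to keep the shared text once and attach all the headwords to it, so the same sentence is not counted three times.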
 