Placeholders in Hebrew

מוערך בקירוב 1900

Using your date parser the above entry works for me.

In your screenshot I can’t see the word before “1815” in your code.

I’ll try to test date ranges and spans next.

־מ 1900 עד 1901

This fails for me. I’ll investigate.

If you indent the init_strings method and remove the lines from the “compiles regular expression strings for matching dates” comment until the end of the class, then it works.

Created pull request #1517: Add Hebrew datehandler


Great! Thanks @Nick-Hall, I’ll test it some more just to make sure all combinations are covered. (I am not sure, but does the above PR also include the __init__.py?)

־מ 1900 עד 1901
That looks like a typo anyway: the ‘־’ should be between the ‘מ’ and the digit, like so: ‘מ־1900 עד 1901’.
No biggie.

I made two changes so that it passed the unit tests.

  1. Fixed the problem with spans. This just made the strings consistent.
  2. Added an extra estimated string.

The __init__.py file has been modified.

@Nick-Hall
Well, this is much better (I should have got to translating that chapter much sooner…) and it works!
Here is what I’ve come up with after running all possible combinations (a pretty short list). It looks like we are back to manipulating the strings with regex?

  • Period+qual - works - משוער מ־_6 ינואר 1875 עד ינואר 1876 - need to remove the extra space (marked with ‘_’)
  • About+qual - works - מחושב בסביבות 6 ינואר 1875 - need to add ‘ב־’ and ‘ל’ to the day and month respectively
  • From+qual - works - מחושב מ 6 ינואר 1875 - need to add ‘ב־’ and ‘ל’ to the day and month respectively
  • To - fails - מחושב אל 6 ינואר 1875 - need to add ‘ב־’ and ‘ל’ to the day and month respectively. The Hebrew translation of ‘To’ has more than one form: it can be ‘אל’, ‘עד’ or ‘ל’, depending on the part of the sentence or the context. In this case it should be ‘עד’, not ‘אל’. (If it is pulled from the main .po file, wouldn’t that cause discrepancies with _date_he.py?)
  • Period+qual - fails - משוער מ־ 6 ינואר 1875 עד 0 - when no date2 is entered. This is probably not Hebrew-specific but more general (shouldn’t date2 be mandatory in this case?)
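
For anyone who wants to reproduce the span cases above outside Gramps, here is a minimal sketch of a tolerant "from ... until ..." matcher. It is not the regular expression the Gramps date parser actually builds; `SPAN_RE` and `parse_span` are hypothetical names, and the only assumptions are that the span starts with the prefix מ (optionally followed by a maqaf and stray whitespace) and that the two dates are separated by עד.

```python
import re

MEM = "\u05de"        # the "from" prefix letter (mem)
MAQAF = "\u05be"      # the Hebrew hyphen (maqaf)
AD = "\u05e2\u05d3"   # the word for "until" (ayin, dalet)

# Hypothetical pattern: prefix, optional maqaf, optional whitespace,
# then the first date, the "until" word, and a mandatory second date.
SPAN_RE = re.compile(
    rf"^{MEM}{MAQAF}?\s*(?P<start>.+?)\s+{AD}\s+(?P<stop>.+)$"
)


def parse_span(text):
    """Return (start, stop) as raw strings, or None if this is not a span."""
    match = SPAN_RE.match(text.strip())
    if match is None:
        return None
    return match.group("start"), match.group("stop")
```

With this pattern the extra space after the maqaf is ignored, and an input with no second date is rejected instead of being turned into a span.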

When should each form of “to” be used?

@Nick-Hall @yaron

An independent “To” in date context (not in a from…to string) should look like:
עד ה־6 לינואר 1875
עד ליוני 1875
עד ל־1875
or
עד לשנת 1875

The independent “From”
מה־6 ליוני 1875
מיוני 1875
מ־1875
or
משנת 1875

The “Period”
מה־6 ליוני 1975 ועד ה־18 למרץ 1875
מיוני 1879 עד אוגוסט
מ־1875 עד 1786
or
משנת 1875 עד שנת 1876

“From… To…”
בין ה־6 ליוני 1975 לבין ה־18 למרץ 1875
בין 1879 לבין אוגוסט
בין 1875 ל־1786
or
בין שנת 1875 לשנת 1876

“Normal” dates
ה־6 ליוני 1875 or ב־6 ליוני 1875 (when right after [qual] “About” it should use ה־6, unless i can find a better translation for “About” which behaves in Hebrew like “calculated” and “estimated”)
ביוני 1875
ב־1875
or
בשנת 1875


@Nick-Hall @yaron

I’ve changed the Hebrew phrase for “About” on Weblate from “בסיבות” to “מקורב”. It does not have exactly the same meaning, but it is close enough (מקורב means approximate), so we can now drop this part:
“(when right after [qual] “About” it should use ה־6, unless i can find a better translation for “About” which behaves in Hebrew like “calculated” and “estimated”).”
So that the date handler does not depend completely on a “publicly shared translation”, we need to make that part work in _date_he.py too. What do you think?


There are some mistakes here.
In the first sentence you’ve added ה before the date. This is wrong: dates are already considered definite (מיודעים), which is why it’s unnecessary to add ה״א הידיעה (the definite article) before a date. The same applies to compounds such as ב+ה, where only the ב remains in its original form (it is pronounced “BE”, not “BA”).
In the following two sentences you added ל before the date itself. It’s unnecessary: in Hebrew we write עד יוני 1875, and the additional letter can be dropped.

Same mistake as before, ה shouldn’t be there, this is relevant to all other occurrences.

The dates should be denoted as ‏6 ביוני 1875, the form ‎‏6 ליוני 1875 is incorrect.

All these rules are portrayed in the following article.

What about the term בערך which is way more native than these 2 options?

I guess we need to present a correct use case to make sure it’s handled according to the language rules.

Thanks :slight_smile:


@avma @yaron Can we summarise the rules?

Am I correct in saying that for the “TO” case you want the following?

עד 6 לינואר 1875
עד יוני 1875
עד 1875

This is what we already have, with the addition of the “ל” before the month.

The “FROM” case is similar:

מ־6 בינואר 1875
מיוני 1875
מ־1875

The month prefix becomes “ב”, and the “מ” is a prefix so we add a maqaf where appropriate.
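
The three “FROM” forms above can be sketched as a small formatter. This is only an illustration of the rules as summarised here, not code from the pull request: `format_from` and the two-entry `MONTHS` table are hypothetical, and the rule set itself is still under discussion in this thread.

```python
MAQAF = "\u05be"  # Hebrew hyphen (maqaf), used between a prefix and a digit
MEM = "\u05de"    # the "from" prefix letter
BET = "\u05d1"    # the prefix placed on the month when a day precedes it

# Abbreviated month table, for illustration only.
MONTHS = {1: "ינואר", 6: "יוני"}


def format_from(year, month=None, day=None):
    """Build a "from" date string following the three cases listed above."""
    if month is None:
        # Year only: the prefix meets a digit, so a maqaf is inserted.
        return f"{MEM}{MAQAF}{year}"
    if day is None:
        # Month and year: the prefix attaches directly to the month name.
        return f"{MEM}{MONTHS[month]} {year}"
    # Full date: maqaf before the day number, and the month takes its own prefix.
    return f"{MEM}{MAQAF}{day} {BET}{MONTHS[month]} {year}"
```

Calling `format_from(1875)`, `format_from(1875, 6)` and `format_from(1875, 1, 6)` produces the year-only, month-and-year and full-date forms respectively. The maqaf appears only where the prefix is followed by a digit.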

Do similar rules apply to “BEFORE” and “AFTER”?
Do we prefix a short month in the same way as a long month?

There are some mistakes here.
In the first sentence you’ve added ה before the date,

Same mistake as before, ה shouldn’t be there, this is relevant to all other occurrences.

Dear “Professor” @yaron,
I must implore you to reserve such astute observations as “This is a mistake” for your exclusive inner sanctum of confidants, particularly when they stem from such flimsy foundations.

The interpretation is flexible; it’s a linguistic choice. The format “ה־4 ביולי” was borrowed into Hebrew from foreign languages. While not endorsed by the Hebrew Language Academy, its usage remains dual.
As long as it is consistent throughout the system, either works for me.

What about the term בערך which is way more native than these 2 options?

There are two stepper fields: date type and date quality. With this word we might end up with something like

About About 1984
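
To make the concern concrete, here is a minimal sketch of how a displayed date might be assembled from the two fields, plus a check for words shared between them. The function names and the tables passed in are hypothetical, and English words are used as in the example above; this is not how Gramps composes its display strings.

```python
def compose(quality_word, type_word, date_text):
    """Join the quality word, the type word and the date, skipping empty parts."""
    parts = [quality_word, type_word, date_text]
    return " ".join(part for part in parts if part)


def shared_words(quality_words, type_words):
    """Return the words that appear in both translation tables, sorted."""
    return sorted(set(quality_words.values()) & set(type_words.values()))
```

If `shared_words` returns anything for a given translation, picking that quality together with that type would repeat the word in the output.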

@avma I’ve added my sources, and you can see that your examples are wrong. I don’t see why you are so angry. It happens: we all live by some flexible linguistic rules until we need them for a higher purpose, and then we have to face our assumptions. I have done this many times before; this is how I know the rules and developed a position for and against them.
This is an inappropriate way to address me. You can be mad at anyone you like, but refrain from calling people names.

Is there a way to exclude Hebrew in that case?

This is mostly correct except for the ל before the month name. It used to be the standard, but the Hebrew Language Academy can sometimes drive us insane, as you saw earlier in this post :slight_smile:

@avma regardless of this thread, refrain from using במידה ו, this is a video explaining why it is wrong:

This is a thorough explanation of this subject:
https://hebrew-academy.org.il/2019/01/10/במידה-ש־/

I understand your frustration but let’s keep it clean.

@avma @yaron What are we going to do about the Hebrew date handler?

I don’t mind helping, but I don’t speak any Hebrew. Is the choice between a formal and informal usage?

I would go for the formal approach from a compatibility perspective, but if @avma doesn’t share my views we can go his way.

Hi @Nick-Hall, you surely have bigger fish to fry, and quite enough of them too, so a little “ה” should be the least of your worries.
As mentioned before: “As long as it is consistent throughout the system, either works for me”, formal or informal. But what bugs me is that I still can’t get every combination working (some work, others don’t).
