From mstahl@redhat.com Fri Jun 28 17:45:02 2013
Return-path: <mstahl@redhat.com>
Envelope-to: lionel@mamane.lu
Delivery-date: Fri, 28 Jun 2013 17:45:02 +0200
Received: from agate.conuropsis.org ([2a02:898:36:10b:e4c2::1])
	by capsaicin.mamane.lu with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256)
	(Exim 4.80)
	(envelope-from <mstahl@redhat.com>)
	id 1Usar8-0001dr-0Q
	for lionel@mamane.lu; Fri, 28 Jun 2013 17:45:02 +0200
Received: from mx1.redhat.com ([209.132.183.28])
	by agate.conuropsis.org with esmtp  (Exim 4.72)
	id 1Usar7-0005Y7-8n
	for lionel@mamane.lu; Fri, 28 Jun 2013 17:45:01 +0200
Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r5SFis9K015386
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <lionel@mamane.lu>; Fri, 28 Jun 2013 11:44:55 -0400
Received: from [10.36.6.125] (vpn1-6-125.ams2.redhat.com [10.36.6.125])
	by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r5SFiqA8031350
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 28 Jun 2013 11:44:53 -0400
Message-ID: <51CDAF74.2020803@redhat.com>
Date: Fri, 28 Jun 2013 17:44:52 +0200
From: Michael Stahl <mstahl@redhat.com>
Organization: Red Hat
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130612 Thunderbird/17.0.6
MIME-Version: 1.0
To: Eike Rathke <erack@redhat.com>, Lionel Elie Mamane <lionel@mamane.lu>,
        Stephan Bergmann <sbergman@redhat.com>
Subject: TimeZone questions
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11
Status: RO
Content-Length: 2287


https://gerrit.libreoffice.org/#/c/4608/

i have now a patch that adds time zone to various UNO structs
(Date/Time/DateTime/DateTimeRange), and now wonder if we really need it.

options for handling time zones:

1) store time zones in UNO structs
   - add Option<short> TimeZone to 4 structs
   - where relevant read and store the TimeZone field
   - all code that handles the structs may need to be aware of time
     zones and handle them
   - no strict ordering of all values possible

2) convert all times to UTC on import
   - add IsTimeZoned flag to UNO structs
   - set that flag accordingly depending on whether a read
     value has a known time zone or not
   - when writing the value add time zone or not depending on flag
   - use that in comparison operators (we don't really know which
     time zone a value with !IsTimeZoned is in so it's not possible to
     compare it)
   - no strict ordering of all values possible
   - actually there's a little problem: we cannot convert Date because
     it's actually a 24 hour interval and cannot be zone-adjusted and
     so it is not possible to answer "is this time-zoned DateTime on
     this time-zoned Date" with this approach...

3) same as 2) except convert all times to _local_ time zone on import

4) _assume_ that values without explicit time zone are in local time
   - same as 3) except...
   - this would not need additional IsTimeZoned flag per assumption
   - all Date/Time values can be strictly ordered
   - when writing the value, either have to always write the local
     time zone, or always no time zone... actually with the additional
     IsTimeZoned flag the difference could be preserved

effectively 4) is the same as the status quo, except that on import the
times are converted if they have explicit time zone.

the real question is, is there any use-case for _preserving_ an existing
time-zone that is read from somewhere; only option 1) can do that.

and is there any use case for _correctly_ comparing Dates with
DateTimes; again only option 1) can do this.
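
To make the Date question concrete, here is a minimal sketch, assuming a Date keeps its own offset as in option 1). `ZonedDate`, `ZonedDateTime`, `toUtcMinutes` and `isOnDate` are hypothetical stand-ins, not the real UNO structs or any existing helper, and the day-number representation is chosen only to keep the arithmetic short. A zoned Date is treated as a 24 hour interval on the UTC timeline, which only works if the Date's offset is still known.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins, NOT css::util::Date / css::util::DateTime.
// Offsets are minutes east of UTC; days are counted from an arbitrary epoch.
struct ZonedDate {
    std::int32_t day;          // day number in the date's own zone
    std::int16_t offsetMin;    // time zone offset of the date
};

struct ZonedDateTime {
    std::int32_t day;          // day number in the value's own zone
    std::int16_t minuteOfDay;  // 0..1439 in the value's own zone
    std::int16_t offsetMin;    // time zone offset of the value
};

// Position of a zoned DateTime on the UTC timeline, in minutes.
inline std::int64_t toUtcMinutes(const ZonedDateTime& dt) {
    return std::int64_t(dt.day) * 1440 + dt.minuteOfDay - dt.offsetMin;
}

// A zoned Date covers the half-open UTC interval [start, start + 24h).
inline bool isOnDate(const ZonedDateTime& dt, const ZonedDate& d) {
    const std::int64_t start = std::int64_t(d.day) * 1440 - d.offsetMin;
    const std::int64_t t = toUtcMinutes(dt);
    return t >= start && t < start + 1440;
}
```

For example, `isOnDate(ZonedDateTime{9, 1380, 0}, ZonedDate{10, 120})` asks whether 23:00 UTC on day 9 falls on day 10 at +02:00. With only a boolean flag on Date, `offsetMin` would be gone and the interval start could not be computed.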

could i get some opinions on this by Monday?

PS: also i remember telling Lionel yesterday that there are
automatically generated operator== for UNO structs, which is total
nonsense, cppumaker does not generate comparison operators


From erack@redhat.com Mon Jul  1 21:02:51 2013
Return-path: <erack@redhat.com>
Envelope-to: lionel@mamane.lu
Delivery-date: Mon, 01 Jul 2013 21:02:51 +0200
Received: from agate.conuropsis.org ([2a02:898:36:10b:e4c2::1])
	by capsaicin.mamane.lu with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256)
	(Exim 4.80)
	(envelope-from <erack@redhat.com>)
	id 1UtjNC-0007JK-U6
	for lionel@mamane.lu; Mon, 01 Jul 2013 21:02:51 +0200
Received: from mx1.redhat.com ([209.132.183.28])
	by agate.conuropsis.org with esmtp  (Exim 4.72)
	id 1UtjNC-0003lU-5o
	for lionel@mamane.lu; Mon, 01 Jul 2013 21:02:50 +0200
Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r61J2kPI026367
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <lionel@mamane.lu>; Mon, 1 Jul 2013 15:02:46 -0400
Received: from localhost (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11])
	by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r61J2ieU017206
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
	Mon, 1 Jul 2013 15:02:45 -0400
Date: Mon, 1 Jul 2013 21:02:43 +0200
From: Eike Rathke <erack@redhat.com>
To: Michael Stahl <mstahl@redhat.com>
Cc: Lionel Elie Mamane <lionel@mamane.lu>,
        Stephan Bergmann <sbergman@redhat.com>
Subject: Re: TimeZone questions
Message-ID: <20130701190243.GC10488@isigqoko.erack.de>
References: <51CDAF74.2020803@redhat.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="at6+YcpfzWZg/htY"
Content-Disposition: inline
In-Reply-To: <51CDAF74.2020803@redhat.com>
X-Accept-Language: de,en
X-Nickname: erAck
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24
Status: RO
X-Status: A
Content-Length: 4868


--at6+YcpfzWZg/htY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Michael,

On Friday, 2013-06-28 17:44:52 +0200, Michael Stahl wrote:

> https://gerrit.libreoffice.org/#/c/4608/

See also my reply there with awkwardly destroyed quoting by gerrit..

> i have now a patch that adds time zone to various UNO structs
> (Date/Time/DateTime/DateTimeRange), and now wonder if we really need it.
>=20
> options for handling time zones:
>=20
> 1) store time zones in UNO structs
>    - add Option<short> TimeZone to 4 structs
>    - where relevant read and store the TimeZone field
>    - all code that handles the structs may need to be aware of time
>      zones and handle them
>    - no strict ordering of all values possible

Why no strict ordering possible? The current local timezone could be
assumed in ordering, respectively all converted to UTC.


> 2) convert all times to UTC on import

No. As already mentioned elsewhere and in my comment to the gerrit
change above, the document content of times is not expected to change
when loaded in a different timezone.

>    - add IsTimeZoned flag to UNO structs
>    - set that flag accordingly depending on whether a read
>      value has a known time zone or not
>    - when writing the value add time zone or not depending on flag
>    - use that in comparison operators (we don't really know which
>      time zone a value with !IsTimeZoned is in so it's not possible to
>      compare it)
>    - no strict ordering of all values possible
>    - actually there's a little problem: we cannot convert Date because
>      it's actually a 24 hour interval and cannot be zone-adjusted and
>      so it is not possible to answer "is this time-zoned DateTime on
>      this time-zoned Date" with this approach...
>=20
> 3) same as 2) except convert all times to _local_ time zone on import

No for the same reason.


> 4) _assume_ that values without explicit time zone are in local time
>    - same as 3) except...
>    - this would not need additional IsTimeZoned flag per assumption

Yes it would, because for new data we may want to distinguish between
timezoned and un-timezoned, for example document properties last edited
and such could be created/written with timezone but would need to be
read un-timezoned if not present.

>    - all Date/Time values can be strictly ordered

We'd still have a mix of timezoned and un-timezoned, where is the
difference?

>    - when writing the value, either have to always write the local
>      time zone, or always no time zone... actually with the additional
>      IsTimeZoned flag the difference could be preserved
>=20
> effectively 4) is the same as the status quo, except that on import the
> times are converted if they have explicit time zone.

Otherwise sounds good and a path to introduce timezones without breaking
current handling.


> the real question is, is there any use-case for _preserving_ an existing
> time-zone that is read from somewhere; only option 1) can do that.

Also option 4) if I didn't misread, as you wrote "values without
explicit time zone", which implies values with explicit timezone are to
be treated different.

> and is there any use case for _correctly_ comparing Dates with
> DateTimes; again only option 1) can do this.

I don't get this, for 1) you wrote that strict ordering would not be
possible.

However, in practice when comparing with DateTime I'd assume that a Date
has no timezone information, or if it has convert it to the same
timezone of the DateTime to be compared against. There may be exceptions
lurking..

  Eike

--=20
LibreOffice Calc developer. Number formatter stricken i18n transpositionize=
r.
GPG key ID: 0x65632D3A - 2265 D7F3 A7B0 95CC 3918  630B 6A6C D5B7 6563 2D3A
For key transition see http://erack.de/key-transition-2013-01-10.txt.asc
Support the FSFE, care about Free Software! https://fsfe.org/support/?erack

--at6+YcpfzWZg/htY
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)

iQIcBAEBCAAGBQJR0dJSAAoJEGps1bdlYy06yMYQAKH+GNa/+k/q7UoNsEbANlWb
fZTRO1ERmADScVPQEewOzKgQFwVO7ttJbEAXlizhqVfs8DUJX8IfMfmezGfP5Fgk
AtJTwrzE/1yl3+3ooT7XPnRP36H4Xj5EChrwIo5klb2gyIMx0u9nuv3G6UC7ehg1
cqR7U5URbXQ3kGe5ytlwwuCJ6amRqWLBFmByfLscRCqG2zJ8jekJtxXW1HXnV3BA
mNe+dvHbsyGuPQe6085ipXw2WGljH8jzyHRfen4Zzx7qceaOPvhptwSexuPQhlQx
nAFt4j6oheIYMR3GV+Dq/S9n+kWTumabrrjjbrOj8HbOCV5LF6pHuqV2OWKUv0n3
EdqKP7k/6H7nPYLB/KWzzPnf65WW/16R9vN3uKzprI76wXU/lR5i+r3PojMHasFl
q+7WAU1yS/WiUx+BUa57mgVy0GsCmC8iVx/SWimM9fVWZmeXzb6aeWqyw338vx6M
kOfqwJ6MLH+G2knYUkj4elQQfdFr4aAlpo+LrdZ0TBGzTtCizLgfFGT5uIH0k7x0
gkt8mfpc8LNcLI9Xe9F6MAznMZvWes+0Ng0ogg5VG7YJvpEfqCa0uDbiTdFjWIB/
c7m+aNvmo1+yOhIZimbhk8xgnWmz96EqcZUgUP4kN+kzTv8dg4veJtE5GmvMUfYQ
ewXO8RGhFARsQ75y8bpW
=EChx
-----END PGP SIGNATURE-----

--at6+YcpfzWZg/htY--


From mstahl@redhat.com Mon Jul  1 21:34:28 2013
Return-path: <mstahl@redhat.com>
Envelope-to: lionel@mamane.lu
Delivery-date: Mon, 01 Jul 2013 21:34:28 +0200
Received: from agate.conuropsis.org ([2a02:898:36:10b:e4c2::1])
	by capsaicin.mamane.lu with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256)
	(Exim 4.80)
	(envelope-from <mstahl@redhat.com>)
	id 1Utjro-0007Lz-Mq
	for lionel@mamane.lu; Mon, 01 Jul 2013 21:34:28 +0200
Received: from mx1.redhat.com ([209.132.183.28])
	by agate.conuropsis.org with esmtp  (Exim 4.72)
	id 1Utjrn-0003nS-AX
	for lionel@mamane.lu; Mon, 01 Jul 2013 21:34:27 +0200
Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r61JYNQB011765
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <lionel@mamane.lu>; Mon, 1 Jul 2013 15:34:23 -0400
Received: from [10.36.7.74] (vpn1-7-74.ams2.redhat.com [10.36.7.74])
	by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r61JYJRG013602
	(version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
	Mon, 1 Jul 2013 15:34:21 -0400
Message-ID: <51D1D9BB.60308@redhat.com>
Date: Mon, 01 Jul 2013 21:34:19 +0200
From: Michael Stahl <mstahl@redhat.com>
Organization: Red Hat
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130612 Thunderbird/17.0.6
MIME-Version: 1.0
To: Eike Rathke <erack@redhat.com>
CC: Lionel Elie Mamane <lionel@mamane.lu>,
        Stephan Bergmann <sbergman@redhat.com>
Subject: Re: TimeZone questions
References: <51CDAF74.2020803@redhat.com> <20130701190243.GC10488@isigqoko.erack.de>
In-Reply-To: <20130701190243.GC10488@isigqoko.erack.de>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
Status: RO
Content-Length: 5165

On 01/07/13 21:02, Eike Rathke wrote:
> Hi Michael,
> 
> On Friday, 2013-06-28 17:44:52 +0200, Michael Stahl wrote:
> 
>> https://gerrit.libreoffice.org/#/c/4608/
> 
> See also my reply there with awkwardly destroyed quoting by gerrit..
> 
>> i have now a patch that adds time zone to various UNO structs
>> (Date/Time/DateTime/DateTimeRange), and now wonder if we really need it.
>>
>> options for handling time zones:
>>
>> 1) store time zones in UNO structs
>>    - add Option<short> TimeZone to 4 structs
>>    - where relevant read and store the TimeZone field
>>    - all code that handles the structs may need to be aware of time
>>      zones and handle them
>>    - no strict ordering of all values possible
> 
> Why no strict ordering possible? The current local timezone could be
> assumed in ordering, respectively all converted to UTC.

but with such assumption it would no longer be 1) but a lot more like 4)

let's call it 5) Option<short> TimeZone + assumption of local time

>> 2) convert all times to UTC on import
> 
> No. As already mentioned elsewhere and in my comment to the gerrit
> change above, the document content of times is not expected to change
> when loaded in a different timezone.
> 
>>    - add IsTimeZoned flag to UNO structs
>>    - set that flag accordingly depending on whether a read
>>      value has a known time zone or not
>>    - when writing the value add time zone or not depending on flag
>>    - use that in comparison operators (we don't really know which
>>      time zone a value with !IsTimeZoned is in so it's not possible to
>>      compare it)
>>    - no strict ordering of all values possible
>>    - actually there's a little problem: we cannot convert Date because
>>      it's actually a 24 hour interval and cannot be zone-adjusted and
>>      so it is not possible to answer "is this time-zoned DateTime on
>>      this time-zoned Date" with this approach...
>>
>> 3) same as 2) except convert all times to _local_ time zone on import
> 
> No for the same reason.
> 
> 
>> 4) _assume_ that values without explicit time zone are in local time
>>    - same as 3) except...
>>    - this would not need additional IsTimeZoned flag per assumption
> 
> Yes it would, because for new data we may want to distinguish between
> timezoned and un-timezoned, for example document properties last edited
> and such could be created/written with timezone but would need to be
> read un-timezoned if not present.

right it would, as i noted in the last bullet point "with ...
IsTimeZoned flag the difference could be preserved" (should have revised
this)

>>    - all Date/Time values can be strictly ordered
> 
> We'd still have a mix of timezoned and un-timezoned, where is the
> difference?

difference is the _assumption_ that un-timezoned values are local times,
whereas with 1) 2) 3) they may be in _any_ timezone which means it could
be any of 29 time-zoned values and we don't know which so it's not
possible to say if it is before or after any timezoned value within
those 28 hours.
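
A small sketch of that ordering problem, assuming offsets range from -14:00 to +14:00 (which matches the 29 values and 28 hours above). All names here (`Order`, `compareFloatingToZoned`, `compareAssumingLocal`) are made up for illustration; times are plain minute counts rather than UNO structs.

```cpp
#include <cassert>
#include <cstdint>

enum class Order { Before, Equal, After, Unknown };

// Assumed offset range: -14:00 .. +14:00, in minutes.
constexpr std::int64_t kMaxOffsetMin = 14 * 60;

// localMin: wall-clock minutes of a value with NO time zone.
// utcMin:   a zoned value already normalised to UTC minutes.
// Without any assumption the un-timezoned value may sit anywhere in
// [localMin - 14h, localMin + 14h] on the UTC timeline.
inline Order compareFloatingToZoned(std::int64_t localMin, std::int64_t utcMin) {
    if (localMin + kMaxOffsetMin < utcMin) return Order::Before;
    if (localMin - kMaxOffsetMin > utcMin) return Order::After;
    return Order::Unknown;  // inside the 28 hour window
}

// With the option 4) assumption the un-timezoned value is in local time,
// so the comparison always has an answer.
inline Order compareAssumingLocal(std::int64_t localMin, std::int64_t utcMin,
                                  std::int16_t localOffsetMin) {
    const std::int64_t asUtc = localMin - localOffsetMin;
    if (asUtc < utcMin) return Order::Before;
    if (asUtc > utcMin) return Order::After;
    return Order::Equal;
}
```

With `localMin = 0` and `utcMin = 600` the first function cannot decide, while the second returns a definite answer once a local offset of, say, 120 minutes is assumed.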

>>    - when writing the value, either have to always write the local
>>      time zone, or always no time zone... actually with the additional
>>      IsTimeZoned flag the difference could be preserved
>>
>> effectively 4) is the same as the status quo, except that on import the
>> times are converted if they have explicit time zone.
> 
> Otherwise sounds good and a path to introduce timezones without breaking
> current handling.

? you said above you don't want any times to be converted, which 4) also
does (exactly same way as 3));

>> the real question is, is there any use-case for _preserving_ an existing
>> time-zone that is read from somewhere; only option 1) can do that.
> 
> Also option 4) if I didn't misread, as you wrote "values without
> explicit time zone", which implies values with explicit timezone are to
> be treated different.

no what i mean is that if you have 00:00+01:00 in a file and you load it
in TZ +02:00 then with 1) you can at the end write out 00:00+01:00 again
but with 3) 4) you only have a boolean flag and the time has been
converted to local time and you have to write a value in TZ +02:00 like
01:00+02:00 (or perhaps it's 23:00+02:00, i forget).
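
The arithmetic can be checked with a tiny sketch. `convertTimeOfDay` is a hypothetical helper, not an existing function; it works on minutes since midnight and wraps around midnight because a bare Time has no date to carry into.

```cpp
#include <cassert>

// Re-express a time of day (minutes since midnight) given in one zone
// in another zone. Offsets are minutes east of UTC.
inline int convertTimeOfDay(int minuteOfDay, int fromOffsetMin, int toOffsetMin) {
    const int utc = minuteOfDay - fromOffsetMin;  // local = UTC + offset
    const int shifted = utc + toOffsetMin;
    return ((shifted % 1440) + 1440) % 1440;      // wrap into 0..1439
}
```

`convertTimeOfDay(0, 60, 120)` gives 60, so 00:00+01:00 (which is 23:00 UTC) becomes 01:00+02:00. The 23:00 result belongs to the opposite direction, `convertTimeOfDay(0, 120, 60)`.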

>> and is there any use case for _correctly_ comparing Dates with
>> DateTimes; again only option 1) can do this.
> 
> I don't get this, for 1) you wrote that strict ordering would not be
> possible.

yes, that's true.

but with 1) all values that have TZ _can_ be strictly ordered.

with 2) 3) 4) Date cannot be because if it only has a bool flag it's not
possible to convert it to a different timezone.

ok, there is a "hybrid" option of bool flag in
DateTime/DateTimeRange/Time and "Option<short>" in Date, that is (i
think) the least number of bits added to get everything to work  :-/

> However, in practice when comparing with DateTime I'd assume that a Date
> has no timezone information, or if it has convert it to the same
> timezone of the DateTime to be compared against. There may be exceptions
> lurking..

well with XMLSchema-2 it's possible to have a date with a timezone and a
dateTime with a different timezone.


From lionel@mamane.lu Wed Jul  3 10:59:44 2013
Date: Wed, 3 Jul 2013 10:59:44 +0200
From: Lionel Elie Mamane <lionel@mamane.lu>
To: Eike Rathke <erack@redhat.com>
Cc: Michael Stahl <mstahl@redhat.com>,
	Stephan Bergmann <sbergman@redhat.com>
Subject: Re: TimeZone questions
Message-ID: <20130703085944.GA8315@capsaicin.mamane.lu>
References: <51CDAF74.2020803@redhat.com>
 <20130701190243.GC10488@isigqoko.erack.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130701190243.GC10488@isigqoko.erack.de>
X-Operating-System: GNU/Linux
X-Request-PGP: http://www.mamane.lu/openpgp/rsa_v4_4096.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
Status: RO
X-Status: A
Content-Length: 2340

On Mon, Jul 01, 2013 at 09:02:43PM +0200, Eike Rathke wrote:
> On Friday, 2013-06-28 17:44:52 +0200, Michael Stahl wrote:

>> I have now a patch that adds time zone to various UNO structs
>> (Date/Time/DateTime/DateTimeRange), and now wonder if we really need it.

>> options for handling time zones:

>> 2) convert all times to UTC on import

> No. As already mentioned elsewhere and in my comment to the gerrit
> change above, the document content of times is not expected to
> change when loaded in a different timezone.

My understanding of what Michael meant was "convert all timezoned
times to UTC". Times without timezone (neither explicit, nor implicit
by the file format) stay untouched, with the "IsTimeZoned" flag set to
false.

>> 3) same as 2) except convert all times to _local_ time zone on import

> No for the same reason.

Same remark. This is about the internal storage and API interface data
type of *timezoned* values *only*. Do we have them as:

 - hour, minute, second, fraction_of_second and explicit offset

 - hour, minute, second and fraction_of_second in UTC

 - hour, minute, second and fraction_of_second in localtime

I oppose the third solution for another reason: the local time zone
changes twice per year (in many/most locations). A long-running
LibreOffice process will get it wrong. Madness.

>> the real question is, is there any use-case for _preserving_ an existing
>> time-zone that is read from somewhere; only option 1) can do that.

> Also option 4) if I didn't misread, as you wrote "values without
> explicit time zone", which implies values with explicit timezone are to
> be treated different.

There is a misunderstanding. The question being discussed is not the
difference between timezoned data and untimezoned data.

The question under discussion is whether we want to distinguish
between 05:12:30.541+01 and 06:12:30.541+02. That is, if a file
contains 05:12:30.541+01, is it OK, when rewriting the file, to
write any one of:

 04:12:30.541Z
 04:12:30.541+00
 05:12:30.541+01
 06:12:30.541+02
 06:42:30.541+0230

OR do we have to write one of these:

 05:12:30.541+01
 05:12:30.541+0100
 05:12:30.541000+01
 05:12:30.541000000+01
 051230.541000000+01


If you tell me that one of
 05:12:30.541+0100
 05:12:30.541000000+01
is not acceptable, then we have a bigger problem.
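
The two notions of "the same" can be sketched as follows. `ZonedTime`, `toUtcMillis`, `sameInstant` and `sameRepresentation` are hypothetical names for illustration only; offsets are in minutes, and a bare time of day is wrapped at midnight since it carries no date.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a time of day with an explicit offset.
struct ZonedTime {
    int hours, minutes, seconds, millis;
    int offsetMin;  // minutes east of UTC
};

// Milliseconds since midnight UTC, wrapped into one day.
inline std::int64_t toUtcMillis(const ZonedTime& t) {
    const std::int64_t dayMs = 24LL * 60 * 60 * 1000;
    const std::int64_t minutes = t.hours * 60LL + t.minutes - t.offsetMin;
    const std::int64_t ms = (minutes * 60 + t.seconds) * 1000 + t.millis;
    return ((ms % dayMs) + dayMs) % dayMs;
}

// Same point on the (wrapped) UTC timeline, whatever the offset.
inline bool sameInstant(const ZonedTime& a, const ZonedTime& b) {
    return toUtcMillis(a) == toUtcMillis(b);
}

// Same fields, including the offset the value was written with.
inline bool sameRepresentation(const ZonedTime& a, const ZonedTime& b) {
    return a.hours == b.hours && a.minutes == b.minutes &&
           a.seconds == b.seconds && a.millis == b.millis &&
           a.offsetMin == b.offsetMin;
}
```

05:12:30.541+01 and 06:12:30.541+02 satisfy `sameInstant` but not `sameRepresentation`. Storing only UTC keeps the first property; writing back the original offset needs the second.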

-- 
Lionel

From lionel@mamane.lu Wed Jul  3 11:33:00 2013
Date: Wed, 3 Jul 2013 11:33:00 +0200
From: Lionel Elie Mamane <lionel@mamane.lu>
To: Eike Rathke <erack@redhat.com>
Cc: Michael Stahl <mstahl@redhat.com>,
	Stephan Bergmann <sbergman@redhat.com>
Subject: Re: TimeZone questions
Message-ID: <20130703093300.GA9429@capsaicin.mamane.lu>
References: <51CDAF74.2020803@redhat.com>
 <20130701190243.GC10488@isigqoko.erack.de>
 <20130703085944.GA8315@capsaicin.mamane.lu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130703085944.GA8315@capsaicin.mamane.lu>
X-Operating-System: GNU/Linux
X-Request-PGP: http://www.mamane.lu/openpgp/rsa_v4_4096.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
Status: RO
Content-Length: 2154

On Wed, Jul 03, 2013 at 10:59:44AM +0200, Lionel Elie Mamane wrote:
> On Mon, Jul 01, 2013 at 09:02:43PM +0200, Eike Rathke wrote:
>> On Friday, 2013-06-28 17:44:52 +0200, Michael Stahl wrote:

>>> the real question is, is there any use-case for _preserving_ an existing
>>> time-zone that is read from somewhere; only option 1) can do that.

>> Also option 4) if I didn't misread, as you wrote "values without
>> explicit time zone", which implies values with explicit timezone are to
>> be treated different.

> The discussion question is whether we want to make the
> difference between 05:12:30.541+01 and 06:12:30.541+02?

The underlying assumption (by both Michael and me) is that if this is
the case, our hands are tied and we need to put a timezone member
in com.sun.star.util.Date/Time/DateTime to preserve that information.

I now realised that this is not true. We could make the choice that
css.util.* has (for timezoned values) UTC time. If some code needs
to store the information that this place in the file should, when
writing the file, get the time/datetime in zone UTC+0500, then that
code stores that information *separately* (either in another
variable or by using a pair<cssu.DateTime, sal_Int16>). When writing
the file, do something like:

Assumptions:

 pair<cssu::DateTime, short int> valueToWrite;
 ::tools::convertToTZ(cssu::DateTime, sal_Int16);

Code:

 // convert the stored UTC value to the offset recorded next to it
 cssu::DateTime valInRightTZ(::tools::convertToTZ(valueToWrite.first, valueToWrite.second));
 assert(valInRightTZ.isTimeZoned == false);
 // zero-padded fields: the width goes between '%' and the conversion
 fprintf(file, "%04d-%02u-%02u %02u:%02u:%02u", valInRightTZ.Year,
         valInRightTZ.Month, valInRightTZ.Day, valInRightTZ.Hours,
         valInRightTZ.Minutes, valInRightTZ.Seconds);

We could have a utility function in ::tools:: to do that kind of
stuff. This allows places that need it to make the difference,
without shoving explicit timezone down the throat of *all*
time-handling code of LibreOffice, allowing the time-handling code
that does not need it to stay simpler.

We could also have a new struct cssu.TimeDateWithTZ for this use
case, or maybe just an internal datatype/class/typedef (not exported
to UNO).
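
For what it's worth, a self-contained sketch of what such a helper might look like. Nothing here exists in ::tools:: today: `convertToTZ` is the hypothetical function from above, `DateTime` is a plain stand-in rather than the real css::util::DateTime, and the offset is assumed to be in minutes east of UTC. The day-number conversions use the well-known civil-calendar arithmetic for the proleptic Gregorian calendar.

```cpp
#include <cassert>
#include <cstdint>

// Plain stand-in, NOT css::util::DateTime.
struct DateTime {
    std::int16_t Year;
    std::uint16_t Month, Day, Hours, Minutes, Seconds;
};

// Days since 1970-01-01 for a proleptic Gregorian date.
inline std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d) {
    y -= m <= 2;
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Inverse of daysFromCivil.
inline void civilFromDays(std::int64_t z, std::int64_t& y, unsigned& m, unsigned& d) {
    z += 719468;
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const unsigned mp = (5 * doy + 2) / 153;
    d = doy - (153 * mp + 2) / 5 + 1;
    m = mp < 10 ? mp + 3 : mp - 9;
    y = static_cast<std::int64_t>(yoe) + era * 400 + (m <= 2);
}

// Re-express a UTC DateTime as wall-clock fields at the given offset.
inline DateTime convertToTZ(const DateTime& utc, std::int16_t offsetMin) {
    std::int64_t secs = daysFromCivil(utc.Year, utc.Month, utc.Day) * 86400
                      + utc.Hours * 3600 + utc.Minutes * 60 + utc.Seconds
                      + std::int64_t(offsetMin) * 60;
    std::int64_t days = secs / 86400;
    std::int64_t rem = secs % 86400;
    if (rem < 0) { rem += 86400; --days; }  // keep the time of day non-negative
    std::int64_t y; unsigned m, d;
    civilFromDays(days, y, m, d);
    return DateTime{ static_cast<std::int16_t>(y),
                     static_cast<std::uint16_t>(m),
                     static_cast<std::uint16_t>(d),
                     static_cast<std::uint16_t>(rem / 3600),
                     static_cast<std::uint16_t>((rem % 3600) / 60),
                     static_cast<std::uint16_t>(rem % 60) };
}
```

Code that only needs a point in time would never call this; only the writer that has to reproduce a particular offset does, right before formatting the fields.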

-- 
Lionel

From erack@redhat.com Wed Jul  3 15:17:58 2013
Return-path: <erack@redhat.com>
Envelope-to: lionel@mamane.lu
Delivery-date: Wed, 03 Jul 2013 15:17:58 +0200
Received: from agate.conuropsis.org ([2a02:898:36:10b:e4c2::1])
	by capsaicin.mamane.lu with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256)
	(Exim 4.80)
	(envelope-from <erack@redhat.com>)
	id 1UuMwY-0002s5-Pp
	for lionel@mamane.lu; Wed, 03 Jul 2013 15:17:58 +0200
Received: from mx1.redhat.com ([209.132.183.28])
	by agate.conuropsis.org with esmtp  (Exim 4.72)
	id 1UuMwX-0005rH-MV
	for lionel@mamane.lu; Wed, 03 Jul 2013 15:17:58 +0200
Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r63DHpXF013971
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <lionel@mamane.lu>; Wed, 3 Jul 2013 09:17:52 -0400
Received: from localhost (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11])
	by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r63DHo2a022096
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
	Wed, 3 Jul 2013 09:17:51 -0400
Date: Wed, 3 Jul 2013 15:17:49 +0200
From: Eike Rathke <erack@redhat.com>
To: Lionel Elie Mamane <lionel@mamane.lu>
Cc: Michael Stahl <mstahl@redhat.com>, Stephan Bergmann <sbergman@redhat.com>
Subject: Re: TimeZone questions
Message-ID: <20130703131749.GB3086@isigqoko.erack.de>
References: <51CDAF74.2020803@redhat.com>
 <20130701190243.GC10488@isigqoko.erack.de>
 <20130703085944.GA8315@capsaicin.mamane.lu>
 <20130703093300.GA9429@capsaicin.mamane.lu>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="jq0ap7NbKX2Kqbes"
Content-Disposition: inline
In-Reply-To: <20130703093300.GA9429@capsaicin.mamane.lu>
X-Accept-Language: de,en
X-Nickname: erAck
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
Status: RO
X-Status: A
Content-Length: 3346


--jq0ap7NbKX2Kqbes
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Lionel,

On Wednesday, 2013-07-03 11:33:00 +0200, Lionel Elie Mamane wrote:

> > The discussion question is whether we want to make the
> > difference between 05:12:30.541+01 and 06:12:30.541+02?
>=20
> The underlying assumption (by both Michael and me) is that if this is
> the case, our hands are tied and we need to put a timezone member
> in com.sun.star.util.Date/Time/DateTime to preserve that information.
>=20
> I now realised that this is not true. We could make the choice that
> css.util.* has (for timezoned values) UTC time. If some code needs
> to store the information that this place in the file should, when
> writing the file, get the time/datetime in zone UTC+0500, then that
> code stores that information *separately* (either in another
> variable or by using a pair<cssu.DateTime, sal_Int16>).

I doubt we want to change all places where times are passed to pass an
additional timezone information. Especially with UNO properties that
would mean to add yet another property to all interfaces handling times.

Also, for transporting a timezoned time that would mean the caller would
have to extract the timezone and convert the time to UTC, just so the
receiver assembles the split information into a timezoned time again.
This is unnecessary and should be avoided.

> We could have a utility function in ::tools:: to do that kind of
> stuff. This allows places that need it to make the difference,
> without shoving explicit timezone down the throat of *all*
> time-handling code of LibreOffice, allowing the time-handling code
> that does not need it to stay simpler.

With the timezone field added I don't see a difference for code not
aware of timezones. It would take the timezoned time as un-timezoned as
it did before where we now ignore a timezone when reading from file.

> We could also have a new struct cssu.TimeDateWithTZ for this use
> case

I doubt, because file reading interfaces would have to pass both anyway
in that case.

  Eike

--=20
LibreOffice Calc developer. Number formatter stricken i18n transpositionize=
r.
GPG key ID: 0x65632D3A - 2265 D7F3 A7B0 95CC 3918  630B 6A6C D5B7 6563 2D3A
For key transition see http://erack.de/key-transition-2013-01-10.txt.asc
Support the FSFE, care about Free Software! https://fsfe.org/support/?erack

--jq0ap7NbKX2Kqbes
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)

iQIcBAEBCAAGBQJR1CR9AAoJEGps1bdlYy069msP/11kWj0xdlH8LYb6p7ZVgHuF
V0c85F2w/3UTcgitwYyAb2ey0ycO+7TPvrsb/fHHQ4ZnS+bktXalzXb90xD66g0t
QtcNyr87puZouJWfzaVZvevdjOxtPcusouLq5s9W9FDwLHS5oSN/zZ+HTe4qGYGV
35/AQ1Bp4REFEwUh6hy5Ko1C9csdMS5AEsI0tv3Hq7bBjZJGxzaxTnLG1F1qTHvZ
P6jyiO3V/CFcHhX+XuT6/4SXinTRN+1j6DJadpdWkMBXs6htHxkHa7UAtH+lXn6t
M09EaS1EDLYDG/jwOUhbGmBVQF8jNcynjbCgnUZvAl6AuTVJ0d86HCbOzF8WltFs
KJEXduV/ukLgN+XHK5DHzAUj/ZM4Gv/VKyZ3yyMcn/ubP9lQuEJ4MtH5Tuh8TA/o
2B1lOmjasVmr47vsnNzZ54xwBZfcYx5RA3js4q+kO7VYRXyB1KCWsWNWyd5L+5bP
GjsOgLHrt0eAqZvxih9kjjnkl94FsvTk9mtLcziX+iKmPb69xwKvQRyshwIoWdN2
VJ348OHrqnKlqbaKEJS8HezTg6cfc4QYRQZQlndL53NMGW/2s5zNhcIHA99WyMzP
zyWUAmjFqgrR/gcZolCS6RjH136qKlE3J3hDkg6+VeacTlFbAIbYp853/iwpu8GL
oyk55gpzaW32b8LwPlVs
=eakh
-----END PGP SIGNATURE-----

--jq0ap7NbKX2Kqbes--


From lionel@mamane.lu Wed Jul  3 16:23:05 2013
Date: Wed, 3 Jul 2013 16:23:05 +0200
From: Lionel Elie Mamane <lionel@mamane.lu>
To: Eike Rathke <erack@redhat.com>
Cc: Michael Stahl <mstahl@redhat.com>,
	Stephan Bergmann <sbergman@redhat.com>, mmeeks@suse.com
Subject: Re: TimeZone questions
Message-ID: <20130703142305.GA9559@capsaicin.mamane.lu>
References: <51CDAF74.2020803@redhat.com>
 <20130701190243.GC10488@isigqoko.erack.de>
 <20130703085944.GA8315@capsaicin.mamane.lu>
 <20130703093300.GA9429@capsaicin.mamane.lu>
 <20130703131749.GB3086@isigqoko.erack.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130703131749.GB3086@isigqoko.erack.de>
X-Operating-System: GNU/Linux
X-Request-PGP: http://www.mamane.lu/openpgp/rsa_v4_4096.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
Status: RO
Content-Length: 2836

On Wed, Jul 03, 2013 at 03:17:49PM +0200, Eike Rathke wrote:
> On Wednesday, 2013-07-03 11:33:00 +0200, Lionel Elie Mamane wrote:

>>> The discussion question is whether we want to make the
>>> difference between 05:12:30.541+01 and 06:12:30.541+02?

>> The underlying assumption (by both Michael and me) is that if this is
>> the case, our hands are tied and we need to put a timezone member
>> in com.sun.star.util.Date/Time/DateTime to preserve that information.

>> I now realised that this is not true. We could make the choice that
>> css.util.* has (for timezoned values) UTC time. If some code needs
>> to store the information that this place in the file should, when
>> writing the file, get the time/datetime in zone UTC+0500, then that
>> code stores that information *separately* (either in another
>> variable or by using a pair<cssu.DateTime, sal_Int16>).

> I doubt we want to change all places where times are passed to pass
> an additional timezone information.

I doubt that, too. *Some* places need a "point in time", and will
*not* add a timezone information to the UTC time. *Some* places will
need additional timezone information and then, well, they will need to
add it.

> Also, for transporting a timezoned time that would mean the caller would
> have to extract the timezone and convert the time to UTC, just so the
> receiver assembles the split information into a timezoned time again.
> This is unnecessary and should be avoided.

That's where a "datetime + timezone" data structure would be far more
convenient rather than "UTC datetime + timezone".

Hmm... Brainwave... Since we have an explicit notion of "(date)time in
unknown timezone", what about "datetime in unknown timezone +
timezone". Semantically, that's exactly the information.

However, I'm increasingly of the opinion there should just be at least
two distinct datatypes. In the terms of my definition on IRC:

1) legacy "some time in unknown timezone"
2) a point in global timeline (assuming Newtonian physics and its
   global time)
3) the combination of a "point in global timeline" and a region of
   earth where calendars and clocks agree.

IMO 2) and 3) should be distinct datatypes, because they are not the
same concept. They are both useful concepts, and we should provide
both to our users.

E.g. think of a calendar. The answer to the question "when is the next
ESC call" is of type 2). It is not any more "2013-07-04 15:00+00" than
it is "2013-07-04 16:00+01" or "2013-07-05 01:00+10".

You argue that our internals and our file formats contain values of
type 3); the example that came around was timetables of airplane
flights with departure and arrival times. So, OK, type 3) should be
*available* for use, but that is no reason to force everybody that
needs 2) to use 3), which is what you are advocating.
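
To make the difference between 2) and 3) concrete, here is a minimal
C++ sketch. The names (ZonedDateTime, toInstant, sameInstant,
sameRepresentation) are made up for illustration and are not part of
any existing API; the offset is assumed to be a signed number of
minutes from UTC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type 3): wall-clock fields plus the UTC offset they were
// written in. Illustrative only, not the css.util API.
struct ZonedDateTime
{
    int year, month, day, hour, minute;
    int offsetMinutes; // signed offset from UTC, e.g. +60 for "+01"
};

// Days since 1970-01-01 for a proleptic Gregorian calendar date.
static std::int64_t daysFromCivil(int y, int m, int d)
{
    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    y -= m <= 2; // treat January/February as months of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const int yoe = y - static_cast<int>(era) * 400;            // [0, 399]
    const int doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const int doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Hypothetical type 2): minutes since 1970-01-01T00:00Z.
std::int64_t toInstant(const ZonedDateTime& z)
{
    return (daysFromCivil(z.year, z.month, z.day) * 24 + z.hour) * 60
           + z.minute - z.offsetMinutes;
}

// Equality as a "point in global timeline".
bool sameInstant(const ZonedDateTime& a, const ZonedDateTime& b)
{
    return toInstant(a) == toInstant(b);
}

// Equality as "point in time + zone": every field must match.
bool sameRepresentation(const ZonedDateTime& a, const ZonedDateTime& b)
{
    return a.year == b.year && a.month == b.month && a.day == b.day
        && a.hour == b.hour && a.minute == b.minute
        && a.offsetMinutes == b.offsetMinutes;
}
```

With these definitions, the three spellings of the ESC call above
compare equal under sameInstant(), while sameRepresentation() tells
them apart.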

-- 
Lionel

From mstahl@redhat.com Wed Jul  3 22:40:39 2013
Return-path: <mstahl@redhat.com>
Envelope-to: lionel@mamane.lu
Delivery-date: Wed, 03 Jul 2013 22:40:39 +0200
Received: from agate.conuropsis.org ([2a02:898:36:10b:e4c2::1])
	by capsaicin.mamane.lu with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256)
	(Exim 4.80)
	(envelope-from <mstahl@redhat.com>)
	id 1UuTqx-0003mf-3O
	for lionel@mamane.lu; Wed, 03 Jul 2013 22:40:39 +0200
Received: from mx1.redhat.com ([209.132.183.28])
	by agate.conuropsis.org with esmtp  (Exim 4.72)
	id 1UuTqw-0006R1-BO
	for lionel@mamane.lu; Wed, 03 Jul 2013 22:40:38 +0200
Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r63KeX8h008049
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Wed, 3 Jul 2013 16:40:33 -0400
Received: from [10.36.7.88] (vpn1-7-88.ams2.redhat.com [10.36.7.88])
	by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r63KeULT007023
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 3 Jul 2013 16:40:31 -0400
Message-ID: <51D48C3D.90708@redhat.com>
Date: Wed, 03 Jul 2013 22:40:29 +0200
From: Michael Stahl <mstahl@redhat.com>
Organization: Red Hat
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130612 Thunderbird/17.0.6
MIME-Version: 1.0
To: Lionel Elie Mamane <lionel@mamane.lu>
CC: Eike Rathke <erack@redhat.com>, Stephan Bergmann <sbergman@redhat.com>,
        mmeeks@suse.com
Subject: Re: TimeZone questions
References: <51CDAF74.2020803@redhat.com> <20130701190243.GC10488@isigqoko.erack.de> <20130703085944.GA8315@capsaicin.mamane.lu> <20130703093300.GA9429@capsaicin.mamane.lu> <20130703131749.GB3086@isigqoko.erack.de> <20130703142305.GA9559@capsaicin.mamane.lu>
In-Reply-To: <20130703142305.GA9559@capsaicin.mamane.lu>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12
Status: RO
X-Status: A
Content-Length: 5893

On 03/07/13 16:23, Lionel Elie Mamane wrote:
> On Wed, Jul 03, 2013 at 03:17:49PM +0200, Eike Rathke wrote:
>> On Wednesday, 2013-07-03 11:33:00 +0200, Lionel Elie Mamane wrote:
> 
>>>> The discussion question is whether we want to make the
>>>> difference between 05:12:30.541+01 and 06:12:30.541+02?
> 
>>> The underlying assumption (by both Michael and me) is that if this is
>>> the case, our hands are tied and we need to put a timezone member
>>> in com.sun.star.util.Date/Time/DateTime to preserve that information.
> 
>>> I now realised that this is not true. We could make the choice that
>>> css.util.* has (for timezoned values) UTC time. If some code needs
>>> to store the information that this place in the file should, when
>>> writing the file, get the time/datetime in zone UTC+0500, then that
>>> code can store that information *separately* (either in another
>>> variable or by using a pair<cssu.DateTime, sal_Int16>).
> 
>> I doubt we want to change all places where times are passed to pass
>> an additional timezone information.
> 
> I doubt that, too. *Some* places need a "point in time", and will
> *not* add a timezone information to the UTC time. *Some* places will
> need additional timezone information and then, well, they will need to
> add it.
> 
>> Also, for transporting a timezoned time that would mean the caller would
>> have to extract the timezone and convert the time to UTC, just so the
>> receiver assembles the split information into a timezoned time again.
>> This is unnecessary and should be avoided.
> 
> That's where a "datetime + timezone" data structure would be far more
> convenient than "UTC datetime + timezone".
> 
> Hmm... Brainwave... Since we have an explicit notion of "(date)time in
> unknown timezone", what about "datetime in unknown timezone +
> timezone". Semantically, that's exactly the information.
> 
> However, I'm increasingly of the opinion there should just be at least
> two distinct datatypes. In the terms of my definition on IRC:
> 
> 1) legacy "some time in unknown timezone"
> 2) a point in global timeline (assuming Newtonian physics and its
>    global time)
> 3) the combination of a "point in global timeline" and a region of
>    earth where calendars and clocks agree.
> 
> IMO 2) and 3) should be distinct datatypes, because they are not the
> same concept. They are both useful concepts, and we should provide
> both to our users.

"should" perhaps, but how do you want to get there?

basic outline of your design:

index ce33ff0..bb4b7a8 100644
--- a/offapi/com/sun/star/util/DateTime.idl
+++ b/offapi/com/sun/star/util/DateTime.idl
      */
     short Year;

+    /** true: UTC false: unknown timezone.
+
+         @since LibreOffice 4.1
+      */
+    boolean IsUTC;

 };

+ /** represents a combined date+time value.
+
+         @since LibreOffice 4.1
+  */
+ struct DateTimeWithTZ : DateTime
+ {
+     /** contains the time zone, as signed offset in minutes from UTC. */
+     short TimeZone;
+ };


here are some problems:

1) as i just discovered, VCL does not have a DateTime widget, but a
   DateField and a TimeField. this means that the VCL widgets cannot
   transparently convert from UTC to local timezone, because they
   would need both date+time to do that.

   so in order to display local times to the user _every_ dialog and
   UI element with one of these needs to be adapted.

2) UNO attributes: if there's an attribute (not a property) of type
   DateTime then it's impossible to set or get a DateTimeWithTZ from it,
   unless the type is changed to DateTimeTZ, in which case DateTime
   will no longer work with it.

  (UNO attributes are translated to getFoo() / setFoo() methods in
   C++/Java)

   so if an interface is to handle both it would be necessary to add a
   new attribute of type DateTimeTZ... but adding a new attribute means
   adding new virtual functions, which then breaks even clients of the
   interface that do not care about DateTime at all!

   XDocumentProperties is affected by this.  but probably it doesn't
   need to support DateTimeTZ.

   office::XAnnotation too, apparently for the comments in the margin.

3) to avoid problem 2) for Properties it's necessary to derive the
   DateTimeWithTZ from the DateTime, which ought to make
   Any::operator>>= work.

   but then we have a subtle problem: it's far too easy to accidentally
   convert a DateTimeWithTZ to a DateTime e.g. by copy-constructing ...
   which is in UTC.
   (or local time zone, doesn't matter - same problem).

   of course we could make DateTimeTZ a distinct type but then
   some existing properties need to be duplicated.

4) usage as return type or parameter type - essentially the same as 2)

   there are such uses in sdb/sdbc, i bet you know best whether they
   need adaption or additional methods.
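
a minimal C++ sketch of the slicing hazard from problem 3). the structs
are trimmed-down stand-ins for the IDL above (assumption: the real
css.util.DateTime has more members), and legacyHours /
hoursSeenByLegacyCode are hypothetical names.

```cpp
#include <type_traits>

// Trimmed-down stand-in for css.util.DateTime with the proposed flag.
struct DateTime
{
    short Year = 0;
    unsigned short Month = 0, Day = 0, Hours = 0, Minutes = 0;
    bool IsUTC = false;
};

// The derived variant from the outline above.
struct DateTimeWithTZ : DateTime
{
    short TimeZone = 0; // signed offset in minutes from UTC
};

// A legacy consumer that knows nothing about time zones.
unsigned short legacyHours(DateTime dt) { return dt.Hours; }

// Derived-to-base conversion is implicit, so this compiles without any
// diagnostic and the TimeZone member is silently sliced off.
unsigned short hoursSeenByLegacyCode(const DateTimeWithTZ& zoned)
{
    return legacyHours(zoned);
}

static_assert(std::is_convertible<DateTimeWithTZ, DateTime>::value,
              "derivation makes the lossy conversion implicit");
```

the legacy function receives the raw Hours value and has no way to
notice that an offset ever existed.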

so while your proposal has merit from an abstract point of view, the
migration path towards implementing it appears to me to break a lot more
things than would be desirable.

but with "local time" instead of UTC it actually looks doable.

except of course "Date" - that would always need Optional<short> TZ, a
bool is completely useless here.  hmm... or we can just leave the bool
out of the Date, and instead have a DateWithTZ too.

so given all the above i'd adapt your proposal to:

1. convert everything to local time on import (except if a
   DateTimeWithTZ is requested, then of course not)
   [ note: local timezone offset determined once at startup ]

2. add "bool IsTimeZoned" to DateTime/Time/DateTimeRange
   (which implies it's known to be in the local timezone)

3. add DateTimeWithTZ which does _not_ derive from DateTime,
   but contains DateTime and short members
   (similar for Date and Time)
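
a rough C++ sketch of points 2. and 3., again with trimmed-down
stand-in structs; the member name Value and the helpers legacyHours /
hoursOf are assumptions made for illustration only.

```cpp
#include <type_traits>

// Stand-in for css.util.DateTime with the flag from point 2.
struct DateTime
{
    short Year = 0;
    unsigned short Month = 0, Day = 0, Hours = 0, Minutes = 0;
    bool IsTimeZoned = false; // true: known to be in the local timezone
};

// Point 3: contains a DateTime instead of deriving from it.
struct DateTimeWithTZ
{
    DateTime Value;     // fields as written, in the zone given below
    short TimeZone = 0; // signed offset in minutes from UTC
};

// A legacy consumer that only understands DateTime.
unsigned short legacyHours(DateTime dt) { return dt.Hours; }

// There is no implicit conversion any more; passing a DateTimeWithTZ to
// legacyHours() directly would be a compile error.
static_assert(!std::is_convertible<DateTimeWithTZ, DateTime>::value,
              "dropping the zone has to be spelled out");

// The caller has to unwrap explicitly, which makes the loss visible.
unsigned short hoursOf(const DateTimeWithTZ& zoned)
{
    return legacyHours(zoned.Value);
}
```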

... and where in the API would we actually want to support a new
DateTimeWithTZ type?

i haven't actually found a screamingly obvious use-case for it yet...



From lionel@mamane.lu Thu Jul  4 12:27:58 2013
Date: Thu, 4 Jul 2013 12:27:58 +0200
From: Lionel Elie Mamane <lionel@mamane.lu>
To: Michael Stahl <mstahl@redhat.com>
Cc: Eike Rathke <erack@redhat.com>, Stephan Bergmann <sbergman@redhat.com>,
	mmeeks@suse.com
Subject: Re: TimeZone questions
Message-ID: <20130704102758.GA17393@capsaicin.mamane.lu>
References: <51CDAF74.2020803@redhat.com>
 <20130701190243.GC10488@isigqoko.erack.de>
 <20130703085944.GA8315@capsaicin.mamane.lu>
 <20130703093300.GA9429@capsaicin.mamane.lu>
 <20130703131749.GB3086@isigqoko.erack.de>
 <20130703142305.GA9559@capsaicin.mamane.lu>
 <51D48C3D.90708@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <51D48C3D.90708@redhat.com>
X-Operating-System: GNU/Linux
X-Request-PGP: http://www.mamane.lu/openpgp/rsa_v4_4096.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
Status: RO
Content-Length: 13795

On Wed, Jul 03, 2013 at 10:40:29PM +0200, Michael Stahl wrote:
> On 03/07/13 16:23, Lionel Elie Mamane wrote:
>> On Wed, Jul 03, 2013 at 03:17:49PM +0200, Eike Rathke wrote:
>>> On Wednesday, 2013-07-03 11:33:00 +0200, Lionel Elie Mamane wrote:

>>>>> The discussion question is whether we want to make the
>>>>> difference between 05:12:30.541+01 and 06:12:30.541+02?

>>>> The underlying assumption (by both Michael and me) is that if this is
>>>> the case, our hands are tied and we need to put a timezone member
>>>> in com.sun.star.util.Date/Time/DateTime to preserve that information.

>>>> I now realised that this is not true. We could make the choice that
>>>> css.util.* has (for timezoned values) UTC time. If some code needs
>>>> to store the information that this place in the file should, when
>>>> writing the file, get the time/datetime in zone UTC+0500, then that
>>>> code can store that information *separately* (either in another
>>>> variable or by using a pair<cssu.DateTime, sal_Int16>).

>>> Also, for transporting a timezoned time that would mean the caller would
>>> have to extract the timezone and convert the time to UTC, just so the
>>> receiver assembles the split information into a timezoned time again.
>>> This is unnecessary and should be avoided.

>> That's where a "datetime + timezone" data structure would be far more
>> convenient than "UTC datetime + timezone".

>> Hmm... Brainwave... Since we have an explicit notion of "(date)time
>> in unknown timezone", what about "datetime in unknown timezone +
>> timezone". Semantically, that's exactly the information.

>> However, I'm increasingly of the opinion there should just be at least
>> two distinct datatypes. In the terms of my definition on IRC:

>> 1) legacy "some time in unknown timezone"
>> 2) a point in global timeline (assuming Newtonian physics and its
>>    global time)
>> 3) the combination of a "point in global timeline" and a region of
>>    earth where calendars and clocks agree.

>> IMO 2) and 3) should be distinct datatypes, because they are not the
>> same concept. They are both useful concepts, and we should provide
>> both to our users.

> "should" perhaps, but how do you want to get there?

> basic outline of your design:

> index ce33ff0..bb4b7a8 100644
> --- a/offapi/com/sun/star/util/DateTime.idl
> +++ b/offapi/com/sun/star/util/DateTime.idl
>       */
>      short Year;
> 
> +    /** true: UTC false: unknown timezone.
> +
> +         @since LibreOffice 4.1
> +      */
> +    boolean IsUTC;
> 
>  };
> 
> + /** represents a combined date+time value.

represents a combined date+time+timezone value.

> +
> +         @since LibreOffice 4.1
> +  */
> + struct DateTimeWithTZ : DateTime
> + {
> +     /** contains the time zone, as signed offset in minutes from UTC. */
> +     short TimeZone;
> + };

(There is some nitpicking to be had here about what the meaning of a
 DateTimeWithTZ with IsUTC=true vs IsUTC=false would be, but we will
 leave such details for later / when it becomes appropriate, and
 concentrate on the big general principles here, namely the choice
 between "add TimeZone to DateTime" or "make it a separate DateTimeTZ
 type". Also future details: good idea or not to derive the type; if
 yes, then maybe derive from a common base that does not have IsUTC, or
 make a new DateTimeUTC that derives from DateTime, or ...)

I think mmeeks basically wanted to drop IsUTC, or at least make it
some internal-only flag that is not visible to UNO / our outside
programming interface, so as not to "pollute" it with legacy
stuff. Also leaving that aside for now.


> here are some problems:

> 1) as i just discovered, VCL does not have a DateTime widget, but a
>    DateField and a TimeField. this means that the VCL widgets cannot
>    transparently convert from UTC to local timezone, because they
>    would need both date+time to do that.

>    so in order to display local times to the user _every_ dialog and
>    UI element with one of these needs to be adapted.

So, the situation without any adaptation is:

1) They get a (Date|Time) with IsUTC=false: no change compared to now,
   they display the content of the value.

2) They get a (Date|Time) with IsUTC=true: they are buggy, don't cope,
   show wrong value.

How can they be adapted, other than by always giving them a DateTime
and never a Date, nor a Time? Well, they cannot, they can only be
buggy.
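
To illustrate why a Time alone is not enough, here is a small C++
sketch of shifting a time of day by a zone offset. ShiftedTime and
shiftTimeOfDay are hypothetical names; the point is only that the
result carries a day change that a Time-only widget has nowhere to
put.

```cpp
#include <cassert>

struct ShiftedTime
{
    int hours, minutes;
    int dayCarry; // how many days the calendar date has to move
};

// Shift a time of day by offsetMinutes and report the day carry.
ShiftedTime shiftTimeOfDay(int hours, int minutes, int offsetMinutes)
{
    assert(hours >= 0 && hours < 24 && minutes >= 0 && minutes < 60);
    const int minutesPerDay = 24 * 60;
    int total = hours * 60 + minutes + offsetMinutes;
    int carry = 0;
    while (total < 0) { total += minutesPerDay; --carry; }
    while (total >= minutesPerDay) { total -= minutesPerDay; ++carry; }
    return {total / 60, total % 60, carry};
}
```

For example, 23:30 UTC shown in a +02:00 zone becomes 01:30 with a day
carry of +1, so the matching DateField would have to change as well.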


Let's compare to the previous proposal, which was essentially to have
only "(Date)?(Time)?WithTZ", and calling it "(Date)?(Time)?".

1) They get a (Date|Time) with: TimeZone.IsPresent=false

   Same situation as now: they display the content of the value.

2) They get a (Date|Time) with
   TimeZone.IsPresent=true
   TimeZone.Value=NN

   They are buggy, don't cope, show wrong value.


How can they be adapted, other than by always giving them a DateTime
and never a Date, nor a Time? Well, they cannot, they can only be
buggy.


It seems to me that this problem is not specific to my proposal, but
uniform in both proposals.

> 2) UNO attributes: if there's an attribute (not a property) of type
>    DateTime then it's impossible to set or get a DateTimeWithTZ from it,
>    unless the type is changed to DateTimeTZ, in which case DateTime
>    will no longer work with it.

Yes, that's just like it is "impossible" to set/get a double from an
int property, unless the type is changed to double, in which case int
will no longer work with it.

Unless the compiler or the programmer inserts a conversion.

>    so if an interface is to handle both

Semantically, the interface should not have to handle both, except as
a legacy backwards compatibility wart.

Either the attribute is a "point in global timeline" (a "2" in my list
above), and then the interface has no business doing anything with
DateTimeTZ. If calling code has a DateTimeTZ, it can extract a
DateTime from it (and we will provide utility functions that will do
that), either explicitly (by the programmer) or automatically (by the
compiler). Just like when setting an int, you can convert from a
double.

Or there is an attribute that is DateTime now, but that should be a
DateTimeTZ; well, that was a bug in the API, and yes, we will have
some pain to correct it. To give a different example than the
int/double, it is similar to the following imaginary scenario:

A putative "Length" attribute that used to take a:

struct OneDimensionMeasure { long val; /* number of micrometers */ };

So, if you had a number of postscript points or a number of inches,
you had to convert it to micrometers, and getLength() would always
give you micrometers.

But now, for some reason, we want to be able to set it directly to the
caller's choice of inches, cm, mm, µm, ..., and when the value is
read, it should be expressed in the same unit as it was set. If it was
set to 2 inches, it should give back 2 inches, and not 50800µm,
although that is the same length. If it was set to 50800µm, it should
give back 50800µm and not 2 inches.

So we add a new

enum unit { MM, CM, µM, INCH };

struct OneDimensionMeasureWithUnit { long val; unit aUnit; };

Depending on the level of backwards compatibility we want, we either:

 - change the type of Length to OneDimensionMeasureWithUnit (and
   that's a violently incompatible API and ABI change)
*or*
 - introduce a new "Length2" or "LengthWithUnit" attribute of type
   OneDimensionMeasureWithUnit, and in the implementation one
   attribute implicitly also changes the other one (and that's a
   compatible API but an incompatible ABI).


While the other proposal is to just change OneDimensionMeasure

struct OneDimensionMeasure { long val; unit aUnit; };

and force all callers/users to cope with multiple units, although they
don't necessarily care about the difference between 2 inches and
50800µm.
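
As a rough C++ sketch of the imaginary example (toMicrometers is a
hypothetical utility function, and the enumerator UM stands in for µM
to keep the identifier plain ASCII):

```cpp
// Legacy type: always micrometers.
struct OneDimensionMeasure { long val; /* number of micrometers */ };

enum class Unit { MM, CM, UM, INCH };

// New, separate type that remembers the unit it was set in.
struct OneDimensionMeasureWithUnit { long val; Unit aUnit; };

// Explicit, lossy conversion for callers that only care about the
// length itself: the unit is dropped on purpose.
OneDimensionMeasure toMicrometers(const OneDimensionMeasureWithUnit& m)
{
    switch (m.aUnit)
    {
        case Unit::MM:   return {m.val * 1000};
        case Unit::CM:   return {m.val * 10000};
        case Unit::UM:   return {m.val};
        case Unit::INCH: return {m.val * 25400};
    }
    return {m.val}; // not reached for valid enumerators
}
```

A consumer that only needs the length keeps taking a
OneDimensionMeasure; 2 inches and 50800 micrometers both arrive there
as val == 50800, and only code that cares about the unit has to touch
the new type.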


The example is a bit contrived, because IMHO in this example Length
should stay of type OneDimensionMeasure, and a caller that has a
OneDimensionMeasureWithUnit should just convert (through a utility
function that is provided by LibreOffice). But frankly, I expect (with
a rather low level of confidence) that will be the situation for most,
if not all, of our DateTime attributes, too. Eike seems to suggest that
most/all of our DateTime attributes should be DateTimeTZ?


> 3) to avoid problem 2) for Properties it's necessary to derive the
>    DateTimeWithTZ from the DateTime, which ought to make
>    Any::operator>>= work.

>    but then we have a subtle problem: it's far too easy to accidentally
>    convert a DateTimeWithTZ to a DateTime e.g. by copy-constructing ...
>    which is in UTC.

>    of course we could make DateTimeTZ a distinct type but then
>    some existing properties need to be duplicated.

Same remark about duplication; may be needed as backwards
compatibility wart.


> 4) usage as return type or parameter type - essentially the same as 2)

>    there are such uses in sdb/sdbc, i bet you know best whether they
>    need adaption or additional methods.

I assume this is about (get|update|set|write|read)(Date|Time|Timestamp).
In order to add support for TimeTZ and DateTimeTZ, I'd add
(get|update|set|write|read)(Time|Timestamp)TZ
methods. It is as if we added support for arrays, or GIS data (such
as latitude+longitude), ... Each data type has its own get/update/set
method.

> so while your proposal has merit from an abstract point of view, the
> migration path towards implementing it appears to me to break a lot more
> things than would be desirable.

The 'add timezone to DateTime' proposal also breaks a lot of stuff,
and possibly in equally/more subtle ways: code that is not prepared to
handle a timezone will just blindly use the year / month / date /
hour, and the programming language's type system will not (and can
not) complain in the least.

While different DateTime / DateTimeTZ types allow the type system to
help in the transition (or is that just my lambda-calculus / theorem
prover background showing?)

> except of course "Date" - that would always need Optional<short> TZ, a
> bool is completely useless here.  hmm... or we can just leave the bool
> out of the Date, and instead have a DateWithTZ too.

Oh yes, definitely, Date and DateWithTZ are not the same concept /
data.


> but with "local time" instead of UTC it actually looks doable.

I dislike "local time" as an imperfection, but I'm not completely
vetoing it as a "make the transition easier" concession.

> so given all the above i'd adapt your proposal to:

> 1. convert everything to local time on import (except if a
>    DateTimeWithTZ is requested, then of course not)
>    [ note: local timezone offset determined once at startup ]

> 2. add "bool IsTimeZoned" to DateTime/Time/DateTimeRange
>    (which implies it's known to be in the local timezone)

> 3. add DateTimeWithTZ which does _not_ derive from DateTime,
>    but contains DateTime and short members
>    (similar for Date and Time)

> ... and where in the API would we actually want to support a new
> DateTimeWithTZ type?

> i haven't actually found a screamingly obvious use-case for it
> yet...

Well, I essentially agree with the "no screamingly obvious
use-case". Which makes it all the more a bad idea for me to switch
*everything* to DateTimeWithTZ, which is what the other proposal
wants. But since Eike (and you?) were saying there were usecases, I'm
open to supporting them.



Now, I'm thinking aloud... Suppose our goals are to maximise
compatibility with legacy, cleanliness of future and safety of the
conversion (that is, the type system and the compiler shout at us for
bugs instead of silently letting them through). Then, what about
separate types that don't inherit from each other?

DateTime (the legacy one)
DateTimeUTC with exactly the same members, but we know it is UTC
DateTimeTZ  with same members and a timezone (offset) added

Then, we slowly convert our codebase so that DateTime contains a local
time (that's the least unreasonable to serve legacy consumers). In
C++ for our internal code, we can add:

struct DateTimeUTC { operator DateTime() const; }; // returns local time
// but *not* struct DateTime { operator DateTimeUTC() const; }; that's unsafe
struct DateTimeTZ
{
    operator DateTimeUTC() const;
    operator DateTime() const; // returns local time
};


So we are actually in a rather good situation that we can gradually
switch our attributes, properties, return types, argument types from
DateTime to DateTimeUTC, cleaning up the codebase as we go along. In the
end, our code does not use the legacy DateTime anymore:

 - assume "DateTime f(foo, bar)" is switched to "DateTimeUTC f(foo, bar)"
   all callers of f that expect a DateTime will have the conversion
   operator make the conversion for them.

 - assume "foo f(DateTime)"

   g() calls f() and we convert an internal variable of g of type
   DateTime to DateTimeUTC. The call still succeeds, converting from
   DateTimeUTC to DateTime.

 - foo f(const &DateTime) is OK as above

 - foo f(&DateTime) is a problem

 - assume "foo f(DateTime)" is switched to "foo f(DateTimeUTC)"
   callers of f that pass it a DateTime will fail to compile.
   -> we have to convert the callers first.
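
A compilable sketch of these conversion rules, under simplifying
assumptions: each type is reduced to a single minute counter instead
of the real year/month/day/... members, localOffsetMinutes() is a
hypothetical stand-in for the local offset (hard-coded to +120 here),
and legacyConsumer is an invented example function.

```cpp
#include <type_traits>

// Stand-in for "local timezone offset determined once at startup".
inline int localOffsetMinutes() { return 120; }

// Legacy type: wall-clock minutes in local time.
struct DateTime { long long wallMinutes; };

// Same shape, but known to be UTC.
struct DateTimeUTC
{
    long long utcMinutes;
    // Safe direction: a point in time can always be shown as local time.
    operator DateTime() const
    {
        return DateTime{utcMinutes + localOffsetMinutes()};
    }
};

// Wall-clock minutes plus the offset they were written in.
struct DateTimeTZ
{
    long long wallMinutes;
    short offsetMinutes;
    operator DateTimeUTC() const
    {
        return DateTimeUTC{wallMinutes - offsetMinutes};
    }
    operator DateTime() const
    {
        const DateTimeUTC utc = *this; // normalise to UTC first
        return utc;                    // then UTC -> local time
    }
};

// Deliberately no DateTime -> DateTimeUTC conversion: legacy values are
// in an unknown zone, so that direction would be a guess.
static_assert(!std::is_convertible<DateTime, DateTimeUTC>::value,
              "legacy DateTime must not silently become UTC");

// An unconverted legacy function still accepts the new types by value
// or by const reference, but not by non-const reference.
long long legacyConsumer(const DateTime& dt) { return dt.wallMinutes; }
```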


Our users are left more in the cold than our internal code, though. We
would essentially continuously break API until we are finished with
the conversion... That's rather sucky as a programming platform.

Maximal backwards API compatibility would happen through duplicated
properties and APIs, etc.: each "f(DateTime)" doubled with a
"fUTC(DateTimeUTC)", with "f" deprecated.

Essentially, it boils down to our priorities between backwards
compatibility and cleanliness of end result / length of the transition
period.


-- 
Lionel

From erack@redhat.com Wed Jul  3 12:05:03 2013
Return-path: <erack@redhat.com>
Envelope-to: lionel@mamane.lu
Delivery-date: Wed, 03 Jul 2013 12:05:04 +0200
Received: from agate.conuropsis.org ([2a02:898:36:10b:e4c2::1])
	by capsaicin.mamane.lu with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256)
	(Exim 4.80)
	(envelope-from <erack@redhat.com>)
	id 1UuJvr-0002XI-Sw
	for lionel@mamane.lu; Wed, 03 Jul 2013 12:05:03 +0200
Received: from mx1.redhat.com ([209.132.183.28])
	by agate.conuropsis.org with esmtp  (Exim 4.72)
	id 1UuJvr-0005GY-4J
	for lionel@mamane.lu; Wed, 03 Jul 2013 12:05:03 +0200
Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r63A4wsw031896
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <lionel@mamane.lu>; Wed, 3 Jul 2013 06:04:58 -0400
Received: from localhost (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11])
	by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r63A4urN025302
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
	Wed, 3 Jul 2013 06:04:57 -0400
Date: Wed, 3 Jul 2013 12:04:55 +0200
From: Eike Rathke <erack@redhat.com>
To: Lionel Elie Mamane <lionel@mamane.lu>
Cc: Michael Stahl <mstahl@redhat.com>, Stephan Bergmann <sbergman@redhat.com>
Subject: Re: TimeZone questions
Message-ID: <20130703100455.GA3086@isigqoko.erack.de>
References: <51CDAF74.2020803@redhat.com>
 <20130701190243.GC10488@isigqoko.erack.de>
 <20130703085944.GA8315@capsaicin.mamane.lu>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="tKW2IUtsqtDRztdT"
Content-Disposition: inline
In-Reply-To: <20130703085944.GA8315@capsaicin.mamane.lu>
X-Accept-Language: de,en
X-Nickname: erAck
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
Status: RO
Content-Length: 3657


--tKW2IUtsqtDRztdT
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Lionel,

On Wednesday, 2013-07-03 10:59:44 +0200, Lionel Elie Mamane wrote:

> >> 2) convert all times to UTC on import
>
> > No. As already mentioned elsewhere and in my comment to the gerrit
> > change above, the document content of times is not expected to
> > change when loaded in a different timezone.
>
> My understanding of what Michael meant was "convert all timezoned
> times to UTC". Times without timezone (neither explicit, nor implicit
> by the file format) stay untouched, with the "IsTimeZoned" flag set to
> false.

My point is that also times with timezone information should stay
untouched and not be converted unless the consumer does not know about
timezones and then it's up to the individual case how to proceed,
usually just strip the timezone and don't convert, thus treat it as
un-timezoned.


> >> 3) same as 2) except convert all times to _local_ time zone on import
>
> > No for the same reason.
>
> Same remark. This is about the internal storage and API interface data
> type of *timezoned* values *only*. Do we have them as:
>
>  - hour, minute, second, fraction_of_second and explicit offset

Yes.

> >> the real question is, is there any use-case for _preserving_ an existing
> >> time-zone that is read from somewhere; only option 1) can do that.
>
> > Also option 4) if I didn't misread, as you wrote "values without
> > explicit time zone", which implies values with explicit timezone are to
> > be treated different.
>
> There is a misunderstanding. The question being discussed is not the
> difference between timezoned data and untimezoned data.
>
> The discussion question is whether we want to make the
> difference between 05:12:30.541+01 and 06:12:30.541+02? That is,
> if a file contains 05:12:30.541+01, is it OK if when rewriting
> the file we write any one of:
>
>  04:12:30.541Z
>  04:12:30.541+00
>  05:12:30.541+01
>  06:12:30.541+02
>  07:42:30.541+0230

No.

> OR do we have to write one of these:
>
>  05:12:30.541+01
>  05:12:30.541+0100
>  05:12:30.541000+01
>  05:12:30.541000000+01
>  051230.541000000+01

Yes.

> If you tell me that one of
>  05:12:30.541+0100
>  05:12:30.541000000+01
> is not acceptable, then we have a bigger problem.

I don't see why these should not be acceptable?

  Eike

-- 
LibreOffice Calc developer. Number formatter stricken i18n transpositionizer.
GPG key ID: 0x65632D3A - 2265 D7F3 A7B0 95CC 3918  630B 6A6C D5B7 6563 2D3A
For key transition see http://erack.de/key-transition-2013-01-10.txt.asc
Support the FSFE, care about Free Software! https://fsfe.org/support/?erack

--tKW2IUtsqtDRztdT
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)

iQIcBAEBCAAGBQJR0/dHAAoJEGps1bdlYy06KqsP/3qykOUMVhz8kkROWUp7Gscp
0XNltZWj0uu5b4SAkfT5vrVybaRLrPLHPj3Lbu/klZZmp7e5NIVEQVstCsRmstYy
cd1azzozkz1B0DkEr5Ph/gB0/MllnV+44VF6lyoRQO0cJuhH1J/hqrCuvFSVzlMx
Gw0cwlYy/RFIK585QGOoDG2g5fVxgvsY2VShrcD+pBC0A2TyimtOEPFxVY9nExNb
CvPNYwPoTS+T3cF7eDmk3/l6qnH7Mt/kpgF4c6K0FkC6sV4syq1xMs4kZCg1tJbn
J1PeoHZlJ3etI/7+ryXz6ltWpGdlu/wy7/Hg/wKQc1M9tcylo6ae+MLGEKRA+MHZ
zZ7Fgem1wEQ7aMg/S+9yxBK3K/t3U3XgP9SGvJNAhoEL1YLynXs93MKR+TdPsIHm
HkaHtVPbscE0YtL482KzRCH2KWGEwDzFrzg6fg0MnrFZO7BCQnNx6YWFmuTmxG5x
D02aasf9djJqNqlTO1ISVlv3g4fcD5xMB+So78OfLU/EeYU8Z9Up5jsqfPL0kw4P
pQbdtwLWPsPnYYyjNrZ1ED1JegagAEj7KPb7otPW62TzjhDn5+44VGL0httG9bvI
jjwjHzTKjWYLrR5qGwmIGnS0TwRaFqfZjr03tDtQXD/T+/F8b2UBhw/yYnQdR3XS
iyLOu4bDbAk3orbekB8M
=T7EK
-----END PGP SIGNATURE-----

--tKW2IUtsqtDRztdT--


