Commit Graph

69 Commits

Author SHA1 Message Date
Jeremy Stanley
cb9c9cf74b Include time zones in WX weather zone data
Some NWS products, such as forecasts and other alerts, have
expiration times marked relative to the issuing authority's local
time zone. In preparation for being able to calculate expirations
accurately relative to the user's local time, embed IANA TZDB
compliant identifiers in all WX weather zone definitions, translated
from their NWS time zone codes.

Correlation files are all updated because of amending overrides
accordingly.
2024-05-09 19:24:08 +00:00
Jeremy Stanley
e10448eb09 Overhaul alert URLs for current NWS usage
NWS switched to using FIPS based county designations for flood
warnings around 2016, so fix the URLs for them. Also flash flood
statements and warnings (but not watches which, for some reason,
still use WX weather zones). Severe weather statements too (but not
special weather statements nor urgent weather messages). Oh, and
thunderstorms.

Add a new tornado alert type with relevant URLs, and fix
configuration examples which reference the older tornado_warning
field which hasn't been available for years.

While at it, remove the separate flood statement alert which seems
to be entirely unused by NWS.
2024-05-09 01:44:09 +00:00
Jeremy Stanley
04e5caae2d Loosen alert/forecast expiry filter by 24 hours
Since NWS alerts and forecasts list their expiration times relative
to the issuing office's local timezone, filtering for expired
documents relative to the user's timezone can lead to them being
filtered early when they're not both coincidentally the same.

Introduce a one day (86400 second) offset buffer as a simple
workaround for now, since the user's and issuing authority's
timezones shouldn't ever differ by more than that. This has a
downside of showing forecasts or alerts which have expired and not
been replaced, but that was possible already if timezones differed
in the other direction, and is preferable to the alternative.

The NWS DBX schema for WX weather zones does include a field for a
timezone code, so a future change may introduce more accurate
calculations in order to identify the relative offset between the
user and issuer, but this will require extending our own zones
format to add a new value for it.
2024-05-08 13:27:12 +00:00
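The one-day buffer described in 04e5caae2d can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the names EXPIRY_BUFFER and is_expired are hypothetical, and expiration times are assumed to already be plain epoch seconds.

```python
import time

# Hypothetical name; the commit describes a one day (86400 second) buffer.
EXPIRY_BUFFER = 86400


def is_expired(expires_at, now=None, buffer=EXPIRY_BUFFER):
    """Return True only once expires_at is more than buffer seconds past."""
    if now is None:
        now = time.time()
    # Without the buffer, a document whose expiry was written in a
    # different timezone than the user's could be dropped up to a day early.
    return now > expires_at + buffer
```

A document that expired an hour ago is still shown under this check, which is the downside the commit message accepts.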
Jeremy Stanley
68518f4631 Refresh correlation data
Switch to the 2023 US Census Bureau data, March 2024 NWS WX zones,
latest OurAirports open data set, refresh active forecast and
station lists, and clean up obsolete overrides. Regenerate all
correlation sets based on these updated sources.
2024-05-08 02:18:49 +00:00
Jeremy Stanley
fc5a61d2b2 Replace ConfigParser's readfp with read_file
Python 3.12 drops the deprecated readfp method from ConfigParser, so
use the newer read_file method instead.
2024-05-08 02:15:44 +00:00
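As a small illustration of the replacement named in fc5a61d2b2, read_file accepts any iterable of text lines, such as an open file. The section and option names below are made up for the example.

```python
import configparser
import io

config = configparser.ConfigParser()

# Hypothetical configuration content standing in for an open file.
source = io.StringIO("[default]\ncity = Raleigh\n")

# read_file() is the supported spelling; readfp() is gone in Python 3.12.
config.read_file(source)
city = config.get("default", "city")
```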
Piraty
757b9658f9 Use raw strings for regular expressions
Python 3.12 requires raw strings for many character sequences used
in regular expressions. Originally submitted by Piraty, thanks!
2024-05-08 00:17:35 +00:00
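A short example of the raw string convention from 757b9658f9. The pattern and sample text are invented for illustration and are not taken from the utility.

```python
import re

# The r prefix keeps the backslashes literal, so \d and \s reach the
# regular expression engine unchanged. Without it, "\d" is an invalid
# string escape that newer Python versions warn about.
pattern = re.compile(r"(\d+)\s*mph")

match = pattern.search("wind 15 mph gusting")
speed = int(match.group(1))
```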
Jeremy Stanley
54e4040d60 Correct FAQ entry about condition filters
Condition filter customization wasn't documented quite right in the
FAQ, and got the space-to-underscore replacement situation
backwards.
2024-05-08 00:13:25 +00:00
Jeremy Stanley
4dab9400e7 Refresh correlation data
Clean up the overrides.conf a bit and regenerate correlations now
that the NWS has retired some WX weather zones. This addresses some
incorrect correlations to now dead forecasts in the Los Angeles
area (thanks to Chime Hart for reporting), as well as likely others.
2023-03-14 13:27:24 +00:00
Jeremy Stanley
53cc0f3b23 Restore missing WX zones
Due to a filtering mistake when assembling the zlist file, 2.4.3
shipped with no NWS WX zone associations. Brown bag fix releasing as
2.4.4.
2023-02-09 21:50:35 +00:00
Jeremy Stanley
6ac3deabd4 Prepare for 2.4.3 release
Switch to the 2022 US Census Bureau data, March 2023 NWS WX zones,
latest OurAirports open data set, and refreshed active forecast and
station lists. Regenerate all correlation sets based on these
updated sources.
2023-02-09 21:00:02 +00:00
Jeremy Stanley
0ec4a4d5d5 Don't use U mode, removed in python3.11
This patch was helpfully submitted by Bas Couwenberg to drop use of
the universal newline flag, since Python 3.11 no longer supports it.
It probably breaks the ability to build new correlation files under
Python 2.7 and earlier, but since it shouldn't affect operation of
the utility with prebuilt correlations (the way it's typically
used), this isn't yet considered to drop Python 2.7 support
altogether.
2023-02-09 17:00:54 +00:00
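To show why dropping the flag in 0ec4a4d5d5 is harmless on Python 3, here is a minimal sketch, assuming a hypothetical stations.txt file with Windows-style line endings. Plain text mode already translates newlines on read.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical data file written with CRLF line endings.
    path = os.path.join(tmp, "stations.txt")
    with open(path, "wb") as handle:
        handle.write(b"KRDU\r\nKJFK\r\n")

    # Mode "r" (not "rU") is enough: Python 3 text mode applies
    # universal newline translation by default.
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
```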
Jeremy Stanley
8b903a7f49 Drop vestigial import of the tarfile module
The correlate() function stopped needing tarfile a couple of years
ago (version 2.4), but it was overlooked that the script continued
to unnecessarily import it. Clean this up.
2022-09-22 13:28:01 +00:00
Jeremy Stanley
22d32da112 Refresh correlation data
A new correlation data build, based on more recent active WX zones
(notably, the Los Angeles California area was re-zoned earlier this
year).
2022-06-11 18:02:37 +00:00
Jeremy Stanley
d896055aac Clean up .gitignore file
Remove the coop-stations.txt file from the ignore list, since it's
not been used for some time.
2022-06-11 17:37:35 +00:00
Jeremy Stanley
455afdc07d Force UTF-8 locale when reading configs and data
Apparently, Python on Windows defaults to assuming CP1252 encoding
unless otherwise specified, as opposed to the UTF-8 assumption made
on POSIX platforms. Since our configuration and data files are
expected to always use UTF-8 encoding, be clear in the
ConfigParser.read() calls about that. We only do this under Python
3.x, as that method doesn't have an encoding parameter in 2.7.

Thanks to Lance Bermudez for reporting this.
2021-12-17 16:29:38 +00:00
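A minimal Python 3 sketch of the explicit encoding described in 455afdc07d. The file name and contents are placeholders written to a temporary directory.

```python
import configparser
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder config file containing a non-ASCII character.
    path = os.path.join(tmp, "weatherrc")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("[default]\ncity = Z\u00fcrich\n")

    config = configparser.ConfigParser()
    # Passing encoding avoids falling back to the platform default,
    # which may not be UTF-8 on Windows.
    loaded = config.read([path], encoding="utf-8")
    city = config.get("default", "city")
```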
Jeremy Stanley
257f9f9a0b Refresh correlation data and update copyright year
Just a basic correlation update based on more recent active METAR
station and WX zone lists. Also update the copyright year for files
which have been edited so far in 2021 as well as in the LICENSE
file.
2021-11-24 14:57:15 +00:00
Jeremy Stanley
cbdccf95dc Correct handling of boolean selections
The selections proxy class, which mashes together command-line
arguments and configuration options, contained a longstanding and
fatal flaw with its handling of boolean values. In particular,
falsey values were consistently treated as truthy due to naively
recasting str to bool (which will always yield True unless empty).
This went unnoticed for so long because the majority of these
settings default to False, meaning the only reason most users had to
set them was to override them to True.

Many thanks to Jordan Russell for bringing this bug to my attention,
and for supplying an initial patch on which this fix is heavily
based.

Co-Authored-By: Jordan Russell
2021-11-10 18:59:18 +00:00
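The pitfall described in cbdccf95dc is easy to reproduce. The helper below is a hypothetical sketch of one way to interpret string settings, not the fix that was actually committed, and the accepted spellings are an assumption.

```python
def to_bool(value):
    """Interpret a configuration or command-line value as a boolean."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("1", "yes", "true", "on"):
        return True
    if text in ("0", "no", "false", "off", ""):
        return False
    raise ValueError("not a boolean: %r" % (value,))


# Naive recasting treats any non-empty string as true.
naive = bool("False")
# Parsing the text gives the intended result.
parsed = to_bool("False")
```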
Jeremy Stanley
d556621fea Refresh active station list and correlation data
Perform a fresh build of data sets from current sources, and add a
few additional overrides for previously unknown stations. Also
update from 2019 to 2021 US Census data, from March 2020 to
September 2021 CountyZone maps, and from 2020-08-29 to 2021-08-29
airport IDs and active METAR stations/WX zones.
2021-08-29 17:57:19 +00:00
Jeremy Stanley
847a98636e Correct and simplify URLError exception handling
Julien Palard pointed out that the way URLError exceptions were
being manually cobbled into the stderr stream wasn't quite working
(thanks!), but it was also unnecessarily complicated for reasons I
don't recall now. Rip most of it out and just go with a basic
catch/error/re-raise there instead.
2021-06-01 15:31:39 +00:00
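The "basic catch/error/re-raise" in 847a98636e might look something like the sketch below. The fetch function and its opener parameter are hypothetical, chosen so the example runs without network access.

```python
import sys
import urllib.error


def fetch(opener, url):
    """Call opener(url), reporting any URLError before re-raising it."""
    try:
        return opener(url)
    except urllib.error.URLError as error:
        # Report the failure, then let the original exception propagate.
        sys.stderr.write(
            "error: failed to retrieve %s (%s)\n" % (url, error.reason)
        )
        raise
```

In real use the opener would be something like urllib.request.urlopen, and the caller decides whether the re-raised exception is fatal.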
Jeremy Stanley
038e2d65a3 Prepare for 2.4.1 release
Update the version string in the project and manpages.
2020-08-30 18:08:25 +00:00
Jeremy Stanley
4b17a82e15 Refresh correlation data
Another data refresh, preparing for an upcoming release.
2020-08-30 18:07:36 +00:00
Jeremy Stanley
16188d7157 Update correlation data
Normalize the overrides, remove some stale overrides for defunct
stations, and regenerate the correlation data set.
2020-08-01 19:21:12 +00:00
Jeremy Stanley
230ad93b11 Refresh active station list and correlation data
Perform a fresh build of data sets from current sources, and add a
few additional overrides for previously unknown stations.
2020-07-26 19:10:14 +00:00
Jeremy Stanley
a1fe816de3 Make missing alert URLs non-fatal
As a more complete fix and future-proofing for the earlier mismatch
between default_atypes and the alert URLs generated for WX zones
during correlation, stop aborting and simply add a warning if a
requested alert type has no corresponding URL.
2020-07-26 19:09:14 +00:00
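A rough sketch of the non-fatal behavior described in a1fe816de3. The function name, the dictionary layout of a zone, and the use of the warnings module are all assumptions made for illustration.

```python
import warnings


def select_alert_urls(zone, atypes):
    """Return URLs for the requested alert types, warning about gaps."""
    urls = {}
    for atype in atypes:
        if atype in zone:
            urls[atype] = zone[atype]
        else:
            # Previously this situation aborted; now it only warns.
            warnings.warn("no URL defined for alert type %s" % atype)
    return urls
```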
Jeremy Stanley
9fa7d3cf18 Correct default_atypes to match what's generated
Kevin Monceaux reported a regression with the 2.4 release. Running
with the -a/--alert option and no limited --atypes or atypes
override in weatherrc resulted in a message about undefined URLs
and no normal output. This problem crept in when hard-coding alert
types in the correlator after ditching the woefully unmaintained
zonecatalog.curr.tar data source (commit 8a37edd).

Update default_atypes so that it covers all relevant non-forecast
URLs the correlate routine embeds.
2020-07-26 19:04:33 +00:00
Chris Lamb
8ad1f3f02c Make the build reproducible
While auditing Debian's packages, Chris Lamb reported[*] that
weather-util's correlation set generation is not reproducible
because it embeds timestamps without a means to override them and
also varies by system timezone. Allow SOURCE_DATE_EPOCH from the
calling environment and assume UTC rather than relying on locale
settings when no timezones are specified.

[*] https://bugs.debian.org/964721
2020-07-25 15:23:20 +00:00
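Honoring SOURCE_DATE_EPOCH as described in 8ad1f3f02c could be sketched like this. The function name and timestamp format are hypothetical; only the environment variable itself comes from the commit message.

```python
import datetime
import os
import time


def build_timestamp(environ=os.environ):
    """Format the build time in UTC, preferring SOURCE_DATE_EPOCH."""
    # Fall back to the current time when the variable is not set.
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    stamp = datetime.datetime.fromtimestamp(epoch, tz=datetime.timezone.utc)
    # Formatting in UTC keeps output independent of the system timezone.
    return stamp.strftime("%Y-%m-%d %H:%M:%S")
```

Two builds run with the same SOURCE_DATE_EPOCH value then embed the same timestamp regardless of when or where they run.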
Jeremy Stanley
9e7002438b Update release notes and docs for 2.4 release
2020-06-21 20:20:33 +00:00
Jeremy Stanley
1bbdcd7e89 Get correlate() working in modern Python 3
Update a bunch of the parsing for various correlation source files
to work in both Python 2.7 and 3.5+, mostly where str vs bytes and
UTF-8 encoding/decoding are concerned. This can be cleaned up
significantly once support for 2.7 is finally dropped.
2020-06-21 20:13:14 +00:00
Jeremy Stanley
0a4712f9a8 Be more thorough about file copyrights
Add a copyright header to the .gitignore file with start and end
years determined from its commit history. Add copyright headers for
the current year to overrides.log and qa.log, and also add
functionality to correlate() which adds these headers from now on.
Update the copyright year on overrides.conf, which was missed in
8a37edd and later commits. All files tracked in this repository now
declare a copyright and refer to the main LICENSE file for licensing
terms.
2020-06-21 13:46:39 +00:00
Jeremy Stanley
62b0ce6d9d Don't use "is" with a literal to test for equality
Solve a SyntaxWarning under Python 3.8 and later for use of the "is"
identity operator when comparing literals, by replacing with the
"==" equality operator.
2020-06-02 14:10:49 +00:00
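To illustrate the change in 62b0ce6d9d with made-up values: equality compares contents, while identity asks whether two names refer to the same object, which is not guaranteed for equal strings.

```python
# Built at run time, so it need not be the same object as a literal.
value = "".join(["no", "ne"])

# "==" compares contents and is the right test for a literal;
# writing value is "none" would draw a SyntaxWarning on Python 3.8+.
matches = (value == "none")
```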
Jeremy Stanley
28a5c0e1e6 Caching support for URLs with port numbers
When mangling URLs of fetched data to store in the local cache, only
split on the first colon so that URLs with port numbers in them are
properly differentiated. Previously, all URLs for the same domain
name landed in a single file if a port number was included, causing
incorrect results to be returned from the cache.
2020-05-31 15:11:32 +00:00
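The effect of splitting only on the first colon, as described in 28a5c0e1e6, can be seen in a small sketch. The cache_name function and its naming scheme are hypothetical and do not reproduce the utility's real cache layout.

```python
def cache_name(url):
    """Derive a cache file name from a URL, keeping any port number."""
    # maxsplit=1 separates only the scheme, so "host:8080/path" survives.
    remainder = url.split(":", 1)[1]
    return remainder.lstrip("/").replace("/", "_")


url = "http://example.org:8080/data/a.txt"
# An unbounded split discards everything after the port's colon.
truncated = url.split(":")[1]
name = cache_name(url)
```

With the unbounded split, every URL on example.org with a port would reduce to the same string, which matches the collision the commit describes.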
Jeremy Stanley
74a9ee62f6 Use a dedicated field for cached search timestamps
Fix a cache corruption issue by using a new "cached" field to hold
the timestamp for cached correlation search results. Previously the
"description" field was being overloaded, but this could cause the
cache to no longer load because of duplicate fields.
2020-05-31 00:24:10 +00:00
Jeremy Stanley
5515f756d4 Decode retrieved files as UTF-8 even on Python 2
Python 2.7 is likely the only Python 2 anyone is using any longer
(even that's well past EOL upstream now), and reasonably recent
versions of 2.7 need the same decode hack as Python 3 anyway when
dealing with some retrieved content. Just get rid of the version
detection and do it under any version.
2020-05-31 00:17:36 +00:00
Jeremy Stanley
fd4b0ae5b2 Add weather zone hkz000 for Hong Kong Observatory
Thanks to Bill Agee for suggesting the Hong Kong Observatory's
weather forecast page. A custom filter is implemented to strip the
forecast text from the HTML page in which it is embedded (if anyone
finds a plaintext version published at an alternate URL, let me know
and I'll rip out the extra routine).
2020-05-31 00:16:07 +00:00
Jeremy Stanley
8e0d7c6e1a Add some fresh overrides
Override a number of active weather stations where searches of
various online sites return names and locations for them which are
not provided by the included sources.
2020-05-24 19:18:35 +00:00
Jeremy Stanley
8a37eddc06 Update correlation sources
Remove the stale metar.tbl and zonecatalog.curr.tar, which the USA
NWS hasn't been updating for many years, and add the public domain
airports.csv file from the amazing ourairports.com community. Also
update to latest (2019) USA Census Bureau location data, March 2020
WX zone information, cooperative sites list from 2018 (latest), and
regenerated active station and zone lists. Loss of the zonecatalog
necessitates directly applying various forecast and alert URL
patterns, though some which appeared unused by NWS for many years
were not included.

Clear out all old overrides, since the vast majority are obsoleted
by refreshed data, and build fresh correlation sets from the above
sources. Basically all sites have switched from HTTP to HTTPS, so
update URLs for this too.
2020-05-24 16:27:58 +00:00
Jeremy Stanley
1ec2848c20 Add FAQ entry about --headers values
Be as clear as possible about what the --headers command-line option
or headers configuration option does, and what sorts of values are
valid for it.
2017-09-16 15:47:03 +00:00
Jeremy Stanley
2a84a53f4a Fix Py3K compatibility for compressed correlation
When run under Python 3.x, explicitly decode decompressed
bytestreams if reading pre-compressed correlation data files.
2017-03-10 15:13:31 +00:00
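A minimal sketch of the explicit decode mentioned in 2a84a53f4a, assuming gzip compression and UTF-8 content. The sample data is invented and held in memory rather than read from a real correlation file.

```python
import gzip
import io

# Stand-in for a pre-compressed correlation data file.
compressed = gzip.compress("[zone]\n".encode("utf-8"))

# gzip.open() defaults to binary mode, so read() returns bytes on
# Python 3 and must be decoded before being parsed as text.
with gzip.open(io.BytesIO(compressed)) as handle:
    text = handle.read().decode("utf-8")
```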
Jeremy Stanley
915f1b8d7d Update release notes and docs for 2.3 release
2016-11-08 00:03:25 +00:00
Jeremy Stanley
96808c892f Correct NOAA WX weather product URL regression
One piecemeal use of the retired weather.noaa.gov/pub URL was missed
in the correlate() function, causing it to be reintroduced for
zone-based reports (such as forecasts) in a subsequent correlation
dataset update. Correct the invalid URLs in the zones file, and
update the correlation routine to embed the correct and working
tgftp.nws.noaa.gov hostname instead.
2016-11-07 15:54:17 +00:00
Jeremy Stanley
aafdd48e1b Update release notes and docs for 2.2 release
2016-10-05 01:26:54 +00:00
Jeremy Stanley
840b91b4fd Correlation set update
* overrides.conf: Latest source data corrections from
script-assisted research.

These remaining files are generated data. Normally they're not
something I feel good about committing into version control, but in
this case it allows for logging and tracking deltas in the data over
time...

* airports: Removed 18 airports corresponding to nonexistent
stations.

* stations: Removed 326 stations with no recent conditions, added
429.

* zones: Removed 45 zones with no recent forecasts, added 104.

* places, zctas: Based on latest Census Bureau data corrections,
updated with new correlations.

* overrides.log: Record of correlation set build.

* slist, zlist: State of active stations and weather zones at the
time of generation.
2016-10-05 01:17:14 +00:00
Jeremy Stanley
a33dd24ec1 Update correlation logic for 2015 Gazetteer
Simple update to handle new filenames for the 2015 US Census Bureau
Gazetteer. Also update a comment which still had the old NWS station
list URL.
2016-10-05 00:49:32 +00:00
Jeremy Stanley
44546ef04d Correct setpath search order
Short-circuit the outer loop in setpath testing so that we stop
iterating through supplied path elements once a match is found.
2016-09-19 15:25:18 +00:00
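One way to stop both loops at the first match, in the spirit of 44546ef04d, is to return from inside the inner loop. This is only a sketch: the function name, its parameters, and the idea of looking up several file names per directory are assumptions.

```python
import os


def find_first(setpath, names, exists=os.path.exists):
    """Return the first existing candidate, searching setpath in order."""
    for directory in setpath:
        for name in names:
            candidate = os.path.join(directory, name)
            if exists(candidate):
                # Returning here ends the outer loop too, so a later
                # directory can never override an earlier match.
                return candidate
    return None
```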
Jeremy Stanley
c63284ca35 Fix minor typo in NEWS file
There was an extra "and" in the first sentence of the 2.1 release
prose.
2016-09-13 18:23:51 +00:00
Jeremy Stanley
c2edb3e07f Update release notes and docs for 2.1 release
2016-09-13 17:41:25 +00:00
Jeremy Stanley
e35fbc4d60 Update NOAA WX weather products URLs
Per http://www.wxforum.net/index.php?topic=29502.0 the old
http://weather.noaa.gov/pub site was deprecated and as of August 23
is no longer in service. Update the software and current dataset to
use working http://tgftp.nws.noaa.gov URLs instead.
2016-08-24 22:56:37 +00:00
Jeremy Stanley
92a0869395 Correlation set update
* overrides.conf: Latest source data corrections from
script-assisted research.

These remaining files are generated data. Normally they're not
something I feel good about committing into version control, but in
this case it allows for logging and tracking deltas in the data over
time...

* airports: Removed 527 airports corresponding to nonexistent
stations.

* stations: Removed 176 stations with no recent conditions, added
196.

* zones: Removed 5 zones with no recent forecasts.

* places, zctas: Based on latest Census Bureau data corrections,
updated with new correlations.

* overrides.log: Record of correlation set build.

* slist, zlist: State of active stations and weather zones at the
time of generation.
2014-11-10 22:15:35 +00:00
Jeremy Stanley
562fc1c1df Support more recent data sources
Add support for 2014 Census Bureau data and the newer version of the
NWS COOP stations file.
2014-10-30 00:26:01 +00:00
Jeremy Stanley
49a6ebe760 Document correlation set rebuilding process
* INSTALL: Add a new section documenting the way in which newer
correlation data sets can be rebuilt and substituted for officially
distributed copies.
2014-02-13 01:57:40 +00:00