This document is a revised version of the "B1 alpha" document dated 24 April 2005, and is mostly the same as that document.
Addendum (02 December 2005): Some figures are shown in another page [link].
The Global Soil Wetness Project, Phase 2 (GSWP2; Dirmeyer et al. 2002) is, besides being a model intercomparison project, an attempt to make a climatological data set of hydrological variables such as soil moisture and grid-box representative runoff, which are not directly measured. For the latter purpose, it is very desirable that the input data be as realistic as possible. Unfortunately, some significant biases were discovered in the input data set for the standard run (B0) after the production phase of the project had started. Some biases affect the data for the whole period. Others concern only the spin-up period, but affect storage variables that have long time scales. For this reason, we consider that we should produce a revised set of standard input data (B1) to be used for climatological studies. (This does not mean that we should redo all the model intercomparison studies and sensitivity studies that can be effectively made based on B0.)
"B1 alpha", which we produced in April 2005, was the first tentative version in the development of "B1". It did not fully cover the spin-up period, and it did not include merging of precipitation with GPCP.
Regrettably, our processing of that version had a code error in reading the elevation data used to correct temperature. That error resulted in temperatures that are too low in many parts of the world. In addition, because of another error, the separation between rain and snow took temperature information from a data set other than the erroneous "B1 alpha" temperature. Thus rainfall and snowfall were not seriously biased, but were not consistent with the plan. The "B1 alpha" data set is not recommended for sensitivity studies. We apologize for the trouble.
This is the second version in the development of "B1". We have corrected the errors found in the programs that were used in the production of "B1 alpha".
In addition, we have modified the algorithm for precipitation rate so that the value is set to zero where monthly precipitation according to GPCC is zero. Also, in the processing of surface pressure we use the ERA40 topography data included as part of the ERA40 data (Beljaars et al. 2004) in the ISLSCP2 collection (Beta release of the DVD-ROM set), which is more consistent with the other variables from ERA40.
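The zero-precipitation rule mentioned above can be sketched as follows. This is only an illustration for a single grid box and month; the function name and the list-based data layout are assumptions made for the example, not the production code.

```python
def zero_dry_months(precip_rates, gpcc_monthly_total):
    """Return the precipitation rates of one grid box for one month.

    If the GPCC monthly total for that grid box is zero, every rate in the
    month is forced to zero; otherwise the rates are returned unchanged.
    """
    if gpcc_monthly_total == 0.0:
        return [0.0 for _ in precip_rates]
    return list(precip_rates)
```

For example, `zero_dry_months([0.1, 0.0, 0.3], 0.0)` gives a month of zeros, while a non-zero GPCC total leaves the input values as they are.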
In addition, the data for shortwave and longwave radiation are provided for the spin-up period as well. Other variables are still provided for the main simulation period (1986 - 1995) only.
These GrADS CTL files require a binary data file "gtd.filepdef", which is the same one used for the forcing data files available from the DODS server at COLA.
In our storage media, these files are first compressed by "gzip" and then archived by "tar". (Thus "*.tar" files contain many "*.nc.gz" files.)
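A minimal sketch of how such an archive could be unpacked with the Python standard library is given below. The function name is hypothetical, and the sketch assumes only what is stated above: a plain "tar" archive whose members are individually gzip-compressed "*.nc.gz" files.

```python
import gzip
import io
import tarfile


def iter_netcdf_bytes(tar_file):
    """Yield (name, data) pairs for each "*.nc.gz" member of a tar archive.

    tar_file is a binary file object opened on the "*.tar" file.
    The ".gz" suffix is dropped from the name and the data are decompressed.
    """
    with tarfile.open(fileobj=tar_file, mode="r") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".nc.gz"):
                compressed = archive.extractfile(member).read()
                yield member.name[:-3], gzip.decompress(compressed)
```

A typical use would be to open the archive with `open(path, "rb")`, loop over the pairs, and write each `data` to a file called `name` before reading it with a NetCDF library.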
Downward shortwave and longwave radiative fluxes of "B1 beta" forcing data (SWdown_b1beta, LWdown_b1beta) are made for the period from July 1982 to December 1995. It should be noted that the product for the first year of the spin-up period (July 1982 to June 1983) is of lower quality, because of the limited availability of source data.
The primary data source for radiation
is the Surface Radiation Budget (SRB) "Release 2"
products produced by Stackhouse et al. at NASA Langley Research Center [LaRC]
(http://srb-swlw.larc.nasa.gov/).
The products cover the time period from July 1983 to October 1995 at
3-hour time intervals, on a 1-degree latitude/longitude grid.
(The grid is actually somewhat decimated in higher latitudes,
but we take the values on the 1-degree grid derived by LaRC's standard
retrieval program as the product.)
There are two products "SW" and "QCSW" for shortwave,
and "LW" and "QCLW" for longwave.
We use here "SW Release 2" and "LW Release 2.1" as primary input data,
because the strategy of SRB production suggests that
"SW" and "LW" are supposed to have higher quality than "QCSW" and "QCLW",
and also because the 3-hourly values from "QCSW" are not included in the
official release from LaRC.
These products do not cover November-December 1995 or part of the spin-up period of GSWP2 (July 1982 to June 1983). In addition, the SW product has data voids at the beginning and the end of the first month (July 1983) and the last month (October 1995) due to problems in pre-processing at LaRC (but, unlike the preliminary product that entered the production of B0 forcing, it does not have similar problems in the intermediate months). The SW product also has occasional "missing" values.
Another data set, ISCCP-FD (Zhang et al., 2004;
http://isccp.giss.nasa.gov/projects/flux.html)
is also taken into account.
The product covers the time period from July 1983 to June 2001 at
3-hour time intervals, on a 2.5-degree quasi-equal-area grid,
without nominal data gaps.
Downward shortwave and longwave fluxes at the surface for the
period from July 1983 to December 1995 have been
interpolated to the 1-degree grid and added to GSWP2 input data sets
(SWdown_isccp, LWdown_isccp)
by Zhao and Dirmeyer at COLA in 2004
(see entry R3 of http://www.iges.org/gswp/sensitivity.html as updated 9 Sept. 2004).
Using these satellite-based data, we can cover the period of GSWP2 except the first year of the spin-up period (July 1982 to June 1983) without using forecast-model-based data. This is the strategy we chose for "B1 beta". The handling of the first year will be discussed later.
For shortwave, the values of the two satellite-based data sets are generally not much different from each other, naturally because they both use cloud information from ISCCP. It seems that the major difference is that SRB has higher spatial resolution but more data gaps. An identified problem of SRB SW is that it under-estimates the downward shortwave flux over the central part of the Tibetan Plateau (Masuda, posted 2004), and the bias extends to a wide area of western China with higher ground elevation, according to a comparison between SRB SW and an empirical estimate based on sunshine duration (calibrated with ground-based observation) by Xu et al. (manuscript 2003). From discussions with SRB specialists, though not conclusive yet, it seems that the parameter values assumed in this region were inadequate and yielded clear-air transmittance values that are too low. It is not yet certain whether this bias exists in other regions of high ground elevation such as the Andes. This bias is not found in SRB QCSW or ISCCP-FD.
For longwave, a significant difference is found, which is also expected because the two products used different sources for the profiles of temperature and humidity in the atmosphere. In particular, comparison with ground-based observation (Masuda, posted 2004) revealed that ISCCP-FD under-estimates downward longwave radiation in cold regions in winter. Downward longwave radiation is sensitive to near-surface air temperature, and near-surface temperature inversion is prevalent in cold regions in winter. It is likely that the TOVS retrieval used by ISCCP-FD does not capture the inversion and shows a higher near-surface temperature than actual, whereas the forecast model of the NASA GEOS Reanalysis used by SRB provides a more realistic profile there. It is not certain, however, whether the Reanalysis gives better temperature and humidity profiles than TOVS in other conditions.
The choice between these data sets is difficult, especially in the light of additional information from Zhao and Dirmeyer (personal communication, 2004) that comparison with ground-based observations in North America favors ISCCP-FD. Our tentative decision for "B1 beta" is to assume SRB as "truth", but to use "SRB QCSW" rather than "SRB SW" as the standard for shortwave over high terrain. For longwave, we use SRB "LW"; thus it is not equivalent to B0, which used "QCLW", but this change is based on data availability and not on explicit consideration of data quality.
We started from the original products of SRB from LaRC and GSWP2 forcing data of ISCCP-FD from COLA.
The difference of time steps had to be accounted for.
GSWP2 requires 3-hour average values, while SRB products are nominally
instantaneous values at discrete time points 3 hours apart.
As done by Zhao and Dirmeyer (2003, Section 2.6),
we apply simple averaging for longwave,
and weighting based on geometrically determined insolation at the
top-of-atmosphere for shortwave.
The ISCCP-FD products are nominally 3-hour averages for the intervals
centered at 0 UTC, 3 UTC etc.
(see the entry dated 31 August 2004
of http://www.iges.org/gswp/faq.html).
In this production, we use them directly as proxies of instantaneous values
of SRB data.
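The conversion from instantaneous values to 3-hour averages can be sketched as follows. The exact weighting formula is not given in this document, so the shortwave part rests on an assumption made only for illustration: the ratio of surface flux to top-of-atmosphere (TOA) insolation is averaged over the two end points of the interval and multiplied by the interval-mean TOA insolation. The function names are hypothetical, and the TOA values are taken as inputs rather than computed from solar geometry.

```python
def longwave_interval_mean(lw_start, lw_end):
    """Simple average of the instantaneous values at both ends of the interval."""
    return 0.5 * (lw_start + lw_end)


def shortwave_interval_mean(sw_start, sw_end, toa_start, toa_end, toa_mean):
    """TOA-weighted 3-hour mean of downward shortwave flux (assumed scheme).

    End points with zero TOA insolation (night) carry no information on the
    ratio and are skipped; if both are dark, the interval mean is zero.
    """
    end_points = ((sw_start, toa_start), (sw_end, toa_end))
    ratios = [sw / toa for sw, toa in end_points if toa > 0.0]
    if not ratios:
        return 0.0
    return toa_mean * sum(ratios) / len(ratios)
```

The reason for weighting shortwave in some such way is that the flux changes strongly within 3 hours around sunrise and sunset, so a plain average of two instantaneous values would misrepresent the interval mean.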
"SRB-ISCCP hybrid" data are produced as follows. Monthly 3-hourly climatological averages at all grid boxes on land are calculated for both SRB (SW, LW) and ISCCP-FD data, and the ratio was used as scale factors. To avoid excessive correction, the scale factor was forced to be between 0.5 and 2, as done in SRB-NCEP hybrid data production for B0. Adjustments of time steps are done after this hybridization. The gaps of SRB data are filled with the SRB-ISCCP hybrid data.
The correction of shortwave over high terrain is done after the filling. Monthly climatological averages are calculated for SW and QCSW, and the scale factor is calculated so that the result retains the values of SW where the elevation (according to ISLSCP2 data) is lower than 1500 m, is scaled to be comparable to QCSW where it is higher than 3000 m, and is linearly interpolated between the two in the intermediate range.
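The elevation-dependent correction can be sketched as below. The thresholds of 1500 m and 3000 m are those given above; the function names, and the choice to interpolate the scale factor itself linearly (which is equivalent to interpolating linearly between the SW and QCSW climatological values), are assumptions made for the illustration.

```python
def qcsw_weight(elevation_m, low=1500.0, high=3000.0):
    """Weight given to QCSW: 0 at or below `low`, 1 at or above `high`."""
    if elevation_m <= low:
        return 0.0
    if elevation_m >= high:
        return 1.0
    return (elevation_m - low) / (high - low)


def high_terrain_factor(sw_clim, qcsw_clim, elevation_m):
    """Scale factor applied to SW so that it approaches QCSW over high terrain."""
    if sw_clim <= 0.0:
        # Assumption: no correction where the SW climatology is zero.
        return 1.0
    weight = qcsw_weight(elevation_m)
    return (1.0 - weight) + weight * qcsw_clim / sw_clim
```

At 2250 m, halfway between the thresholds, the weight is 0.5, so a grid box with climatological SW of 200 and QCSW of 240 receives a factor of about 1.1.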
The first year of spin-up (July 1982 to June 1983) is filled with hybrid data based on NCEP Reanalysis 2. The strategy is the same as that of B0 forcing data (Zhao and Dirmeyer, 2003, Section 2.6): calculating the ratio between monthly 3-hourly average values of "observed" and "forecast" values, and using the ratio as the correction factor. As the "observed" data for shortwave radiation, we use SRB SW which has been corrected to match QCSW in high terrain. For longwave radiation, we use SRB Rel. 2.1 LW.
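The ratio-based hybridization used in this section can be sketched as follows for one grid box, one calendar month and one 3-hourly time slot. The limits of 0.5 and 2 are those stated for the SRB-ISCCP hybrid; applying the same limits to the NCEP-based hybrid, the treatment of a zero denominator, and the function names are assumptions made for the illustration.

```python
def correction_factor(observed_clim, other_clim, lower=0.5, upper=2.0):
    """Ratio of climatological means, limited to the range [lower, upper]."""
    if other_clim <= 0.0:
        # Assumption: leave values unchanged when the ratio is undefined
        # (for example, shortwave at night).
        return 1.0
    return min(upper, max(lower, observed_clim / other_clim))


def fill_gaps(observed, other, factor):
    """Keep observed values; where one is missing (None), use the scaled other value."""
    return [obs if obs is not None else factor * alt
            for obs, alt in zip(observed, other)]
```

For instance, with climatological means of 150 ("observed") and 100 (the other product), the factor is 1.5, and a missing "observed" value is replaced by 1.5 times the corresponding value of the other product.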
We have made the hybrid data for the period July 1983 - December 1995 as well, which are available on request.
Downward shortwave and longwave radiative fluxes of B0 forcing data (Zhao and Dirmeyer, 2003, Section 2.6) are, in principle, based on the SRB "Release 2". But the SRB products which entered into the production of B0 forcing via the ISLSCP2 project were not the release version (not available then), but a preliminary version of SRB's "SW" and "QCLW" products for the period from January 1986 to October 1995. They covered neither the spin-up period of GSWP2 (July 1982 to December 1985) nor November-December 1995. In addition, the SW product had data voids at the beginning and the end of every month.
To fill those gaps of SRB input, Zhao and Dirmeyer (2003) used the products of NCEP Reanalysis 2 (Kanamitsu et al., 2002). Radiative fluxes from 0--3 and 3--6 hour forecasts of NCEP R2 had been interpolated onto the 1-degree latitude-longitude grid. They intended to adjust the bias of NCEP R2 to match SRB by using the ratio between monthly 3-hourly climatological values as factors.
It seems, however, that their adjustment, or hybrid data production, did not work as intended. When we compared the data at the places where ground-based observations are available, much larger biases were found for the hybrid data than for SRB data at many places (Masuda, posted 2004). Also, areal averages of radiative fluxes for the spin-up period are significantly different from those of the main period (1986 - 1995) of GSWP2 (Kenji Tanaka, DPRI Kyoto Univ., presentation at GSWP2 meeting in Kyoto, September 2004).
When we reproduced SRB-NCEP hybrid data following the description by Zhao and Dirmeyer (2003) (using LaRC original SRB SW and LW combined with GSWP2 input SWdown_ncep and LWdown_ncep, thus not exactly the same as B0), no obvious bias was found in the graphs of monthly values at several locations. Thus we suspect that the algorithm for the B0 data production was good, but that some trouble occurred in the actual data handling.
The forcing data required for GSWP2 include surface air pressure, air temperature and specific humidity at 2 m above ground, and wind speed at 10 m above ground. The B0 data of these variables are based on NCEP R2 with some adjustments (Zhao and Dirmeyer, 2003, Sections 2.3 - 2.5).
Later, ERA40 data became available for ISLSCP2 (Betts and Beljaars, 2003)
covering the period from 1986 to 1995 on the 1-degree grid.
The data are added to GSWP2 forcing data at COLA
(see entry M2 of http://www.iges.org/gswp/sensitivity.html as updated 9 Sept. 2004).
Statistical comparison between GSWP2 forcing data sets and synoptic
observations worldwide (Tanaka et al., presented 2005)
revealed that ERA40 has generally better quality than NCEP R2.
For wind speed, however, their comparison
showed that ERA40 under-estimates it while NCEP R2 over-estimates it
(though only the latter is explicitly mentioned in their paper).
Accordingly, we consider that we should use ERA40 rather than NCEP R2, even though it is not easy to extend it to the spin-up period. For "B1 beta", we cover the 10-year period (1986 - 1995) only.
As for wind speed, in spite of the suggestion of bias by Tanaka et al., we do not have enough information to make corrections. Therefore we adopt the forcing data produced at COLA based on ERA40 (Wind_era) as part of our "B1 beta" without change.
For other variables, we use the forcing data produced at COLA (Tair_era, Qair_era and Psurf_era) as the source of ERA40 data, and process them as described in the following subsections.
As in B0, monthly mean temperature is given by CRU (Climatic Research Unit, Univ. of East Anglia) data (New et al., 2000), adjusted for the difference of elevation between CRU and ISLSCP2 assuming the standard lapse rate of 6.5 K / km. But unlike B0, adjustment of variability within a month based on the diurnal temperature range of CRU data is not attempted. Thus, the temperature data of "B1 beta" (Tair_b1beta) are ERA40 shifted by the difference of monthly values between elevation-adjusted CRU and ERA40.
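The temperature adjustment can be sketched as follows for one grid box and one month. The lapse rate of 6.5 K / km is the value given above; the function and argument names, the use of kelvin, and the sign convention (temperature decreases as elevation increases) are assumptions made for the illustration.

```python
LAPSE_RATE_K_PER_M = 6.5e-3  # standard lapse rate, 6.5 K / km


def cru_at_islscp2_elevation(cru_monthly_k, cru_elev_m, islscp2_elev_m):
    """Move the CRU monthly mean temperature to the ISLSCP2 surface elevation."""
    return cru_monthly_k - LAPSE_RATE_K_PER_M * (islscp2_elev_m - cru_elev_m)


def shift_era40(era40_series_k, era40_monthly_mean_k, cru_adjusted_k):
    """Shift every ERA40 value in the month by one constant offset.

    The offset makes the monthly mean equal to the elevation-adjusted CRU
    value, while the variability within the month stays that of ERA40.
    """
    offset = cru_adjusted_k - era40_monthly_mean_k
    return [value + offset for value in era40_series_k]
```

For example, a CRU value of 280 K at 500 m becomes about 273.5 K at an ISLSCP2 elevation of 1500 m; an ERA40 series with a monthly mean of 272 K is then shifted upward by about 1.5 K throughout the month.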
The decision to avoid adjustment of variability is based on
the finding that it is the likely cause of excessive temperature variability
found in cold regions
(reported by Kenji Tanaka
and included on 13 November 2003 in
http://www.iges.org/gswp/faq.html).
In B0 (Zhao and Dirmeyer, 2003, Section 2.3),
the ratio of the diurnal range between CRU and NCEP R2 is used
as a scale factor of all variability (not only diurnal variation
but also synoptic-scale variability).
The diurnal range of CRU is often much larger than
that of NCEP R2 in some regions where observing stations that entered
data production at CRU are sparse.
Therefore, the adjustment resulted in exaggeration of synoptic variability
in such areas.
The diurnal range of ERA40 has the same order of magnitude as that of NCEP R2.
Though we do not know whether ERA40 or CRU better represents diurnal range,
we consider it better to avoid applying correction based on CRU
indiscriminately worldwide.
The same caution applies to mean temperature. We suspect that surface air temperature may be better represented by ERA40, and that adjusting it to match CRU may be detrimental in such areas as the interior of Greenland. Nevertheless, we tentatively decide to use the adjustment based on the mean temperature of CRU everywhere, because we do not yet know the regional data quality of CRU and ERA40 well.
Our production of surface pressure data (Psurf_b1beta) is the same as in B0 (Zhao and Dirmeyer, 2003, Section 2.4) except that it uses ERA40 instead of NCEP R2.
The surface elevation of ERA40 is taken from ERA40 data (Beljaars et al. 2004) in the Beta release of the ISLSCP-2 DVD-ROM set. The file era40/E40_fixed_1d/E40_Z_fixed_1d_m00.asc contains values of geopotential, which are divided by 9.80665 to yield values of height. We expect that interpolation from ERA40's original grid to 1-degree grid for this data set is consistent with that of other variables in ERA40 for ISLSCP2. (For "B1 alpha", we interpolated the surface geopotential field by ourselves and the method was not consistent with ISLSCP2.)
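The conversion from geopotential to height is a single division, sketched below. The function name is hypothetical, and the sketch assumes the geopotential values are in m^2 s^-2; reading the ASCII file itself is not shown because its layout is not described here.

```python
STANDARD_GRAVITY = 9.80665  # m s^-2, the divisor stated in the text


def geopotential_to_height(geopotential):
    """Convert surface geopotential (m^2 s^-2) to height (m)."""
    return geopotential / STANDARD_GRAVITY
```

A geopotential of 9806.65 m^2 s^-2 thus corresponds to a height of 1000 m.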
Our production of specific humidity data (Qair_b1beta) is the same as in B0 (Zhao and Dirmeyer, 2003, Section 2.5) except that it uses ERA40 instead of NCEP R2, and our "B1 beta" versions of temperature and pressure data are used instead of B0 data.
Our production of data sets of rainfall rate, snowfall rate and convective rainfall rate (Rainf_b1beta, Snowf_b1beta and Rainf_C_b1beta) is made by adjusting ERA40 to match raingauge-based data with compensation of undercatch by gauges. Merging with GPCP gauge-satellite combined data (done for B0 as described by Zhao and Dirmeyer 2003, Section 2.2) is not attempted for "B1 beta", not because we consider that we should not do it in B1, but because we want to check the effect of gauge correction before making the final B1.
The data sources are as listed below.
The time period is limited to 1986-1995, mainly because of availability of ERA40 data on the 1-degree grid. For this period GPCC (Global Precipitation Climatology Centre, http://www.dwd.de/) data, included in ISLSCP2 (see http://islscp2.sesda.com/), are also available, thus we do not need another raingauge data set of CRU.
The processing is as follows.
Convective rain rate (Rainf_C_b1beta) is calculated as Rainf_b1beta * Rainf_C_era / Rainf_era .
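The formula above can be written as a small function. The variable names follow the data set names in the text; returning zero where ERA40 has no rain, to avoid division by zero, is an assumption made for the illustration.

```python
def convective_rain(rainf_b1beta, rainf_c_era, rainf_era):
    """Rainf_C_b1beta = Rainf_b1beta * Rainf_C_era / Rainf_era.

    The convective fraction of ERA40 rainfall is applied to the adjusted
    rainfall rate.
    """
    if rainf_era <= 0.0:
        # Assumption: no convective rain where ERA40 has no rain at all.
        return 0.0
    return rainf_b1beta * rainf_c_era / rainf_era
```

If ERA40 gives a rainfall rate of 4 of which 1 is convective, an adjusted rainfall rate of 2 (in the same units) yields a convective rate of 0.5.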
Here, we mainly mention logical differences. Quantitative differences will be reported on another occasion. Since we do not apply merging with GPCP data, our strategy of production of "B1 beta" is more similar to that of the "GPCCWC" data sets (Rainf_gpccwc and Snowf_gpccwc) produced by COLA for sensitivity study P2 than to that of the B0 data sets (Rainf_gswp and Snowf_gswp).
We find that the gauge correction in GPCCWC (the difference between it and GPCC data) is excessive, especially in middle latitudes where rainfall is more prevalent than snowfall. The production of GPCCWC (as well as of B0) put the wind speed at 10 m above ground of NCEP R2 directly into the formula of Motoya et al. (2003) without reduction to gauge height. Also, as mentioned in Subsection 3.1 above, NCEP R2 has a positive bias in wind speed. These two factors are likely causes of the excessive correction in GPCCWC.
Another difference that can contribute significantly to systematic differences is the separation between rain and snow. GPCCWC and B0 followed the separation made by the forecast model of NCEP R2, which is largely dependent on the near-surface temperature in the model. We consider that the separation should be consistent with the temperature that is considered effective at the surface height given by ISLSCP2. It is difficult to examine how this particular alteration contributes to the difference between GPCCWC and "B1 beta". Instead, to see the direct effect of the separation, we have compared the ratio of snowfall to total precipitation in our separation (before raingauge correction, in our trial version) with those in ERA40 and NCEP R2 in long-term zonal averages. Our separation generally yields more snow (less rain) than ERA40. Combined with raingauge correction, it results in somewhat larger total precipitation than would be obtained if the ERA40 rain/snow separation were used. The difference between ours and NCEP R2 is more subtle.
The use of NCEP R2 in GPCCWC versus ERA40 in "B1 beta" also makes some difference. But, as both are scaled by GPCC data, we consider it would not be so important in their time-averaged values.
http://www.jamstec.go.jp/frcgc/research/p2/masuda/radcmp/.