Topic: Regional Statistics - Discrepancy on Total Clinchable Mileage

yakra
Re: Regional Statistics - Discrepancy on Total Clinchable Mileage
« Reply #30 on: February 15, 2022, 03:33:24 pm »
systemMileageByRegion & clinchedSystemMileageByRegion
Code:
#!/bin/bash
# Usage: pass the site's base URL as the first argument.
server=$1

echo `date`": Downloading user logs"
mkdir -p userlogs.tmp
for u in `curl $server/stats/allbyregionactivepreview.csv 2>/dev/null | cut -f1 -d, | tail -n +2 | head -n -1`; do
  wget -O userlogs.tmp/$u.log "$server/logs/users/$u.log" 2>/dev/null
  echo -n "$u "
done; echo
echo `date`": Done."

for sys in `tail -n +2 /home/yakra/tm/HighwayData/systems.csv | grep -E -v '^#|^$|;devel$' | cut -f1 -d';'`; do
  data=`curl $server/stats/$sys-all.csv 2>/dev/null`
  regions=`echo "$data" | head -n 1 | sed -e 's~^Traveler,Total,~~' -e 's~,~ ~g'`
  users=`echo "$data" | head -n -1 | tail -n +2 | cut -f1 -d,`
  ut=`echo $users | wc -w`
  for rg in $regions; do
    un=0
    for u in $users; do
      echo -en "$sys in $rg for $u                "

      # web
      web=$(curl "$server/user/system.php?units=miles&u=$u&sys=$sys&rg=$rg" 2>/dev/null \
            | grep -E '[0-9.]+ of [0-9.]+ miles' \
            | sed -r 's~.*>([0-9.]+ of [0-9.]+) miles.*~\1~')
      web1=`echo "$web" | cut -f1 -d' '`
      web2=`echo "$web" | cut -f3 -d' '`

      # log
      if [ `echo $regions | wc -w` == 1 ]; then
        log=$(grep "^System $sys ([a-z]\+) overall: " userlogs.tmp/$u.log | sed -r 's~.*: (.*) mi \(.*~\1~')
      else
        beg=$(grep -m 1 -n "^System $sys by region:$" userlogs.tmp/$u.log | cut -f1 -d:)
        end=$(grep -m 1 -n "^System $sys by route (traveled routes only):$" userlogs.tmp/$u.log | cut -f1 -d:)
        log=$(tail -n +$beg userlogs.tmp/$u.log | head -n $(expr $end - $beg) | grep "^  $rg: " | sed -r 's~.*: (.*) mi \(.*~\1~')
      fi
      log1=`echo "$log" | cut -f1 -d' '`
      log2=`echo "$log" | cut -f3 -d' '`

      #compare
      if   [ "$web1" != "$log1" ]; then
        echo -e "\n  web: $web"
        echo -e "  log: $log"
      elif [ "$web2" != "$log2" ]; then
        echo -e "\n  web: $web"
        echo -e "  log: $log"
        break
      else echo -en '\r'
      fi

      un=`expr $un + 1`
    done
  done
done
echo `date`": Finished."
I'm going to do 2 runs of this:
• lab2, tm.conf -> tm.conf.updating (normal .sql file)
• lab2, tm.conf -> tm.conf.standard (all FLOAT -> DOUBLE)
I had planned a third run on noreaster (normal .sql file), but a site update is going to happen and invalidate the results. Just lab2 will be sufficient.
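For anyone following along, the log-side parsing in the script boils down to one sed and two cuts. Here is a minimal standalone sketch of that step. The sample line is invented for illustration; only its general shape is inferred from the grep and sed patterns in the script, and extract_mileage is a hypothetical helper name, not part of the script.

```shell
# Pull "X of Y" out of a by-region log line, then split it into the
# traveled (X) and clinchable (Y) mileages, mirroring log1/log2 in the script.
# ASSUMPTION: the sample line below is made up; only its
# "  RG: X of Y mi (" shape is taken from the script's patterns.
extract_mileage() {
  printf '%s\n' "$1" | sed -E 's~.*: (.*) mi \(.*~\1~'
}

sample='  NY: 123.45 of 6789.01 mi (1.82%)'
pair=$(extract_mileage "$sample")
traveled=$(printf '%s\n' "$pair" | cut -f1 -d' ')
clinchable=$(printf '%s\n' "$pair" | cut -f3 -d' ')
echo "traveled=$traveled clinchable=$clinchable"
```

The script compares these fields as plain strings against the same two fields scraped from the web page, so any difference in the stored value that survives formatting shows up as a mismatch.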
« Last Edit: February 15, 2022, 09:41:25 pm by yakra »
Sri Syadasti Syadavaktavya Syadasti Syannasti Syadasti Cavaktavyasca Syadasti Syannasti Syadavatavyasca Syadasti Syannasti Syadavaktavyasca

Jim
« Reply #31 on: February 15, 2022, 09:51:25 pm »
We might want DECIMAL instead of FLOAT or DOUBLE here.  I did not know about it, but I think we can force a certain number of digits after the decimal point that way.

yakra
« Reply #32 on: February 15, 2022, 10:28:10 pm »
Would that require further changes to the tables, like setting the precision or number of decimal places, etc?

Quote:
Sure, let's change that where needed.  More memory/disk but we have plenty of that.  I don't think we want a blanket FLOAT->DOUBLE, just placed where numbers are likely to get larger.

Good news is, just changing overallMileageByRegion, clinchedOverallMileageByRegion, systemMileageByRegion & clinchedSystemMileageByRegion appears to do the trick.
The larger numbers live in these tables, and at 58573 rows across the 4 of them, that's way fewer than the (clinched)(Connected)Routes tables, with 961038 rows combined.
I could script around for the unlikely diff there... but that'd take literally days. :D Edit: (I really like saying Edit:) I won't attempt this (thankfully), because there's no way to disambiguate some ConnectedRoutes, such as I-265FutLou in KY & IN.
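As a rough illustration of why the larger totals are the ones at risk in a 4-byte FLOAT column: single precision keeps about 7 significant decimal digits, so once a value has 7 digits before the decimal point, the hundredths can no longer be stored reliably. The sketch below simulates that rounding in awk, which itself works in double precision. to_float32 is a hypothetical helper, the input values are made up rather than taken from the tables, and this only approximates what the database does.

```shell
# Round a positive value to the nearest single-precision number
# (24-bit significand) and print it with two decimals.
# ASSUMPTION: inputs are positive and well inside the normal float range.
to_float32() {
  awk -v x="$1" 'BEGIN {
    x += 0                             # force numeric
    e = 0; p = 1                       # find p = 2^e with p <= x < 2p
    while (p * 2 <= x) { p *= 2; e++ }
    while (p > x)      { p /= 2; e-- }
    ulp = p / 8388608                  # p / 2^23: float spacing in [p, 2p)
    printf "%.2f\n", int(x / ulp + 0.5) * ulp
  }'
}

to_float32 1234.56      # prints 1234.56
to_float32 1234567.80   # prints 1234567.75
```

The small value comes back unchanged at two decimals, while the seven-digit one lands on the nearest multiple of 0.125.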
« Last Edit: February 16, 2022, 11:37:28 am by yakra »

Jim
« Reply #33 on: February 16, 2022, 07:09:37 am »
Yes, it looks like we would want something like DECIMAL(7,2), but since I just learned of this option I don't know for sure.  If some changes to DOUBLE have fixed the immediate problem, I'd say let's open an Issue (if there's not one already) to remind us to investigate further which columns might be more appropriate as a DECIMAL.
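Under the usual SQL meaning, DECIMAL(7,2) allows 7 digits in total with 2 of them after the decimal point, so the largest storable value is 99999.99. Below is a quick sketch for checking whether a given total would fit a proposed DECIMAL(M,D). fits_decimal is a hypothetical helper, and the test values are made up, not taken from the tables.

```shell
# Report whether a value fits in DECIMAL(M,D):
# M = total digits, D = digits after the decimal point.
# ASSUMPTION: the value is rounded to D places before the range check.
fits_decimal() {
  awk -v v="$1" -v m="$2" -v d="$3" 'BEGIN {
    r = sprintf("%." d "f", v) + 0     # value rounded to D places
    if (r < 0) r = -r
    limit = 10 ^ (m - d)               # first magnitude that no longer fits
    if (r < limit) print "fits"; else print "overflow"
  }'
}

fits_decimal 99999.99 7 2      # prints fits
fits_decimal 1234567.89 7 2    # prints overflow
```

Any column whose totals can reach six digits before the decimal point would need a larger M than 7.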