SimThyr

We start by opening SimThyr (V.4.0.6) and creating a .tsv file (see this presentation: https://www.glensbo.dk/circle/2023/02/06/create-and-use-scenarios-in-simthyr/).

Importing the tsv file. (Having found the file, place a tick in the box to the right, click the More option above and pick: Copy Folder Path to Clipboard. For me this gives: ~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/ and then you just need to add the filename.)

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
Kubota5_tsv <- read.table(file = '~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota_5.tsv', sep = '\t', header = TRUE)
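Because the same folder is reused several times, it can help to store it once and build the full file name from it. This is a minimal sketch, not the method used in this document: the names folder and tsv_path are made up for illustration, and the folder is the one copied from the file dialog above.

```r
# Folder copied from the file dialog; substitute your own path here.
folder <- "~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml"

# file.path() joins folder and file name with the correct separator.
tsv_path <- file.path(folder, "Kubota_5.tsv")

# Only read the file if it is really there; otherwise show the expanded path.
if (file.exists(tsv_path)) {
  Kubota5_tsv <- read.table(file = tsv_path, sep = "\t", header = TRUE)
} else {
  message("File not found: ", path.expand(tsv_path))
}
```

Keeping the folder in one variable also avoids small spelling differences between the paths used in later steps.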

Creating further columns for later presentations

Specifically values for SPINA-GD, SPINA-GT and TSHI

What kind of columns and information are in the Kubota5 file?

str(Kubota5_tsv)
## 'data.frame':    630722 obs. of  11 variables:
##  $ i   : int  NA 1 2 3 4 5 6 7 8 9 ...
##  $ t   : chr  "day h:m:s" "1900-01-01 00:00:00" "1900-01-01 00:01:40" "1900-01-01 00:03:20" ...
##  $ TRH : chr  "ng/l" "2500.00" "3010.4316" "3673.7146" ...
##  $ pTSH: chr  "mU/l" "4.00" "4.00" "4.00" ...
##  $ TSH : chr  "mU/l" "12.7705" "12.7705" "12.9466" ...
##  $ TT4 : chr  "nmol/l" "46.3769" "46.3769" "46.3769" ...
##  $ FT4 : chr  "pmol/l" "6.7203" "6.7203" "6.7203" ...
##  $ TT3 : chr  "nmol/l" "1.8818" "1.8818" "1.8818" ...
##  $ FT3 : chr  "pmol/l" "3.1311" "3.1311" "3.1311" ...
##  $ cT3 : chr  "pmol/l" "4495.876" "4495.876" "4495.876" ...
##  $ X   : logi  NA NA NA NA NA NA ...

There are some rubbish columns alongside the data I need and, importantly, a two-row header, which I would like to merge into a one-row header. I picked this solution:

#https://stackoverflow.com/questions/17797840/reading-two-line-headers-in-r
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0     ✔ purrr   1.0.1
## ✔ tibble  3.1.8     ✔ stringr 1.5.0
## ✔ tidyr   1.3.0     ✔ forcats 1.0.0
## ✔ readr   2.1.3     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
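# Change to csv format. write.csv() also writes the row names,
# which is where the extra first column (NA_1) comes from.
write.csv(Kubota5_tsv,'~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota5.csv')
# Paste the first two lines (column names and units) together, column by column
header <- sapply(read.csv("~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota5.csv",
                          nrow=2,
                          header = FALSE),
                 paste,
                 collapse="_")
# Skip the two header lines and apply the merged names. With the default
# header = TRUE the next line (i = 1) is consumed as a header, so the data start at i = 2.
result <- read.csv("~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota5.csv", skip=2, col.names=header)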

str(result)
## 'data.frame':    630720 obs. of  12 variables:
##  $ NA_1       : int  3 4 5 6 7 8 9 10 11 12 ...
##  $ i_NA       : int  2 3 4 5 6 7 8 9 10 11 ...
##  $ t_day.h.m.s: chr  "1900-01-01 00:01:40" "1900-01-01 00:03:20" "1900-01-01 00:05:00" "1900-01-01 00:06:40" ...
##  $ TRH_ng.l   : num  3010 3674 5036 2827 2380 ...
##  $ pTSH_mU.l  : num  4 4 4 4 4 4 4 4 4 4 ...
##  $ TSH_mU.l   : num  12.8 12.9 13.2 13.6 13.7 ...
##  $ TT4_nmol.l : num  46.4 46.4 46.4 46.4 46.4 ...
##  $ FT4_pmol.l : num  6.72 6.72 6.72 6.72 6.72 ...
##  $ TT3_nmol.l : num  1.88 1.88 1.88 1.88 1.88 ...
##  $ FT3_pmol.l : num  3.13 3.13 3.13 3.13 3.13 ...
##  $ cT3_pmol.l : num  4496 4496 4496 4496 4496 ...
##  $ X_NA       : logi  NA NA NA NA NA NA ...
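The str() output above starts at i = 2 and carries an extra row-name column (NA_1), which suggests that the csv round trip drops the first data row and picks up the row names. A minimal sketch of an alternative, assuming the export has exactly two header lines, reads the header lines and the data straight from the tab-separated file with header = FALSE. The toy file and the names tsv_lines, head2, merged and demo are made up for illustration.

```r
# Toy stand-in for a SimThyr export: names in line 1, units in line 2.
tsv_lines <- c(
  "i\tt\tTSH\tFT4",
  "\tday h:m:s\tmU/l\tpmol/l",
  "1\t1900-01-01 00:00:00\t12.7705\t6.7203",
  "2\t1900-01-01 00:01:40\t12.7705\t6.7203"
)
tmp <- tempfile(fileext = ".tsv")
writeLines(tsv_lines, tmp)

# Read only the two header lines, as text, and paste them column by column.
head2 <- read.delim(tmp, header = FALSE, nrows = 2,
                    colClasses = "character", na.strings = character(0))
merged <- unname(vapply(head2, paste, character(1), collapse = "_"))

# make.names() turns "/" and spaces into "." so the names are valid in R.
merged <- make.names(merged)

# Skip both header lines; header = FALSE keeps the first data row.
demo <- read.delim(tmp, header = FALSE, skip = 2, col.names = merged)
str(demo)
```

With this approach the numeric columns are read as numbers directly, and no intermediate csv file is needed.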

There we are. Next I calculate additional columns:

library(SPINA)
result <- result %>%  mutate(ratio = FT3_pmol.l/FT4_pmol.l,
                             TSH_Sum = (FT4_pmol.l*0.52)+(FT3_pmol.l*0.38)+((FT4_pmol.l+FT3_pmol.l)*0.1),
                             TSH_TSH_Sum = TSH_mU.l/TSH_Sum,
                             SPINA_GT = SPINA.GT(TSH_mU.l, FT4_pmol.l),
                             SPINA_GD = SPINA.GD(FT4_pmol.l, FT3_pmol.l),
                             TSHI = estimated.TSHI(TSH_mU.l, FT4_pmol.l),
                             TRH_TSH = TRH_ng.l/TSH_mU.l,
                             TT4_FT4 = TT4_nmol.l/FT4_pmol.l,
                             TT3_FT3 = TT3_nmol.l/FT3_pmol.l,
                             sqrtTSH = sqrt(TSH_mU.l),
                             sqrtTRH = sqrt(TRH_ng.l),
                             sqrtFT4 = sqrt(FT4_pmol.l),
                             sqrtFT3 = sqrt(FT3_pmol.l),
                             sqrtTT4 = sqrt(TT4_nmol.l))

str(result)
## 'data.frame':    630720 obs. of  26 variables:
##  $ NA_1       : int  3 4 5 6 7 8 9 10 11 12 ...
##  $ i_NA       : int  2 3 4 5 6 7 8 9 10 11 ...
##  $ t_day.h.m.s: chr  "1900-01-01 00:01:40" "1900-01-01 00:03:20" "1900-01-01 00:05:00" "1900-01-01 00:06:40" ...
##  $ TRH_ng.l   : num  3010 3674 5036 2827 2380 ...
##  $ pTSH_mU.l  : num  4 4 4 4 4 4 4 4 4 4 ...
##  $ TSH_mU.l   : num  12.8 12.9 13.2 13.6 13.7 ...
##  $ TT4_nmol.l : num  46.4 46.4 46.4 46.4 46.4 ...
##  $ FT4_pmol.l : num  6.72 6.72 6.72 6.72 6.72 ...
##  $ TT3_nmol.l : num  1.88 1.88 1.88 1.88 1.88 ...
##  $ FT3_pmol.l : num  3.13 3.13 3.13 3.13 3.13 ...
##  $ cT3_pmol.l : num  4496 4496 4496 4496 4496 ...
##  $ X_NA       : logi  NA NA NA NA NA NA ...
##  $ ratio      : num  0.466 0.466 0.466 0.466 0.466 ...
##  $ TSH_Sum    : num  5.67 5.67 5.67 5.67 5.67 ...
##  $ TSH_TSH_Sum: num  2.25 2.28 2.33 2.4 2.42 ...
##  $ SPINA_GT   : num  0.62 0.619 0.616 0.613 0.612 ...
##  $ SPINA_GD   : num  43.1 43.1 43.1 43.1 43.1 ...
##  $ TSHI       : num  3.45 3.46 3.48 3.51 3.52 ...
##  $ TRH_TSH    : num  236 284 381 208 173 ...
##  $ TT4_FT4    : num  6.9 6.9 6.9 6.9 6.9 ...
##  $ TT3_FT3    : num  0.601 0.601 0.601 0.601 0.601 ...
##  $ sqrtTSH    : num  3.57 3.6 3.63 3.69 3.71 ...
##  $ sqrtTRH    : num  54.9 60.6 71 53.2 48.8 ...
##  $ sqrtFT4    : num  2.59 2.59 2.59 2.59 2.59 ...
##  $ sqrtFT3    : num  1.77 1.77 1.77 1.77 1.77 ...
##  $ sqrtTT4    : num  6.81 6.81 6.81 6.81 6.81 ...
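As a sanity check, a few of the new columns can be recomputed by hand from the values shown in the str() output of Kubota5_tsv. The sketch below makes two assumptions that are not stated in this document: that TSH_Sum can be collapsed into 0.62 × FT4 + 0.48 × FT3 (which follows from expanding the bracket), and that estimated.TSHI corresponds to Jostel's TSH index, ln(TSH) + 0.1345 × FT4. The second assumption is only supported here by the fact that the number agrees with the output above.

```r
# Values taken from the str() output of Kubota5_tsv
FT4 <- 6.7203   # pmol/l
FT3 <- 3.1311   # pmol/l
TSH <- 12.7705  # mU/l

# FT3/FT4 ratio, as in the mutate() call
ratio <- FT3 / FT4

# TSH_Sum as written in the mutate() call ...
tsh_sum_long <- (FT4 * 0.52) + (FT3 * 0.38) + ((FT4 + FT3) * 0.1)
# ... and with the bracket expanded: (0.52 + 0.1) * FT4 + (0.38 + 0.1) * FT3
tsh_sum_short <- 0.62 * FT4 + 0.48 * FT3

# Assumption: Jostel's TSH index, ln(TSH) + 0.1345 * FT4
tshi <- log(TSH) + 0.1345 * FT4

print(round(c(ratio = ratio, TSH_Sum = tsh_sum_long, TSHI = tshi), 3))
```

The rounded results agree with the ratio, TSH_Sum and TSHI columns in the output above.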

Next I want to get rid of columns 1, 2 and 12.

result <- result[c(3:11, 13:26)]

str(result)
## 'data.frame':    630720 obs. of  23 variables:
##  $ t_day.h.m.s: chr  "1900-01-01 00:01:40" "1900-01-01 00:03:20" "1900-01-01 00:05:00" "1900-01-01 00:06:40" ...
##  $ TRH_ng.l   : num  3010 3674 5036 2827 2380 ...
##  $ pTSH_mU.l  : num  4 4 4 4 4 4 4 4 4 4 ...
##  $ TSH_mU.l   : num  12.8 12.9 13.2 13.6 13.7 ...
##  $ TT4_nmol.l : num  46.4 46.4 46.4 46.4 46.4 ...
##  $ FT4_pmol.l : num  6.72 6.72 6.72 6.72 6.72 ...
##  $ TT3_nmol.l : num  1.88 1.88 1.88 1.88 1.88 ...
##  $ FT3_pmol.l : num  3.13 3.13 3.13 3.13 3.13 ...
##  $ cT3_pmol.l : num  4496 4496 4496 4496 4496 ...
##  $ ratio      : num  0.466 0.466 0.466 0.466 0.466 ...
##  $ TSH_Sum    : num  5.67 5.67 5.67 5.67 5.67 ...
##  $ TSH_TSH_Sum: num  2.25 2.28 2.33 2.4 2.42 ...
##  $ SPINA_GT   : num  0.62 0.619 0.616 0.613 0.612 ...
##  $ SPINA_GD   : num  43.1 43.1 43.1 43.1 43.1 ...
##  $ TSHI       : num  3.45 3.46 3.48 3.51 3.52 ...
##  $ TRH_TSH    : num  236 284 381 208 173 ...
##  $ TT4_FT4    : num  6.9 6.9 6.9 6.9 6.9 ...
##  $ TT3_FT3    : num  0.601 0.601 0.601 0.601 0.601 ...
##  $ sqrtTSH    : num  3.57 3.6 3.63 3.69 3.71 ...
##  $ sqrtTRH    : num  54.9 60.6 71 53.2 48.8 ...
##  $ sqrtFT4    : num  2.59 2.59 2.59 2.59 2.59 ...
##  $ sqrtFT3    : num  1.77 1.77 1.77 1.77 1.77 ...
##  $ sqrtTT4    : num  6.81 6.81 6.81 6.81 6.81 ...
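Dropping columns by position works, but the positions shift as soon as a column is added or removed earlier in the script. A minimal sketch of dropping the same columns by name instead, using a small made-up data frame (toy, drop_cols and trimmed are illustrative names only):

```r
# Toy data frame with the three unwanted columns and two wanted ones
toy <- data.frame(
  NA_1     = 3:5,
  i_NA     = 2:4,
  TSH_mU.l = c(12.8, 12.9, 13.2),
  X_NA     = NA,
  ratio    = c(0.466, 0.466, 0.466)
)

# Columns to remove, identified by name rather than by position
drop_cols <- c("NA_1", "i_NA", "X_NA")

# Keep every column whose name is not in drop_cols
trimmed <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]
names(trimmed)
```

The same drop_cols vector could be applied to result, since the three column names are the ones shown in the earlier str() output.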

Now the data frame is in a format that I can use for different visualisations.
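One thing to keep in mind for plots over time is that t_day.h.m.s is still a character column. A minimal sketch of turning such strings into date-times, assuming the "year-month-day hour:minute:second" layout seen in the output above (t_chr, t_time, step_s and hours are illustrative names):

```r
# Three time stamps copied from the str() output
t_chr <- c("1900-01-01 00:01:40", "1900-01-01 00:03:20", "1900-01-01 00:05:00")

# Parse as POSIXct; UTC avoids daylight-saving jumps in a simulated time axis
t_time <- as.POSIXct(t_chr, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Step between consecutive rows, in seconds
step_s <- as.numeric(diff(t_time), units = "secs")

# Elapsed time since the first row, in hours, handy as an x axis
hours <- as.numeric(difftime(t_time, t_time[1], units = "hours"))
```

For these three rows the step is 100 seconds, which matches the spacing of the time stamps in the output.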