same code, new tables #6

Merged
merged 1 commit into from Sep 16, 2024
@@ -3,14 +3,14 @@ title: 'CTBench Eval Project Notebook:'
 author: "Your Name Here"
 date: "`r format(Sys.time(), '%d %B %Y')`"
 output:
-  pdf_document:
-    toc: true
-    number_sections: true
-    fig_caption: false
   html_document:
     toc: true
     number_sections: true
     df_print: paged
+  pdf_document:
+    toc: true
+    number_sections: true
+    fig_caption: false
 subtitle: DAR Assignment 3 (Fall 2024)
 ---
 ```{r setup, include=FALSE}
275 changes: 251 additions & 24 deletions StudentNotebooks/Assignment03/dar-f24-assignment3-template.html
@@ -2179,14 +2179,16 @@ <h2><span class="header-section-number">5.1</span> How do results differ
<caption>Differences by Model on CT-Pub</caption>
<colgroup>
<col width="20%" />
<col width="17%" />
<col width="16%" />
<col width="14%" />
<col width="14%" />
<col width="14%" />
<col width="13%" />
<col width="11%" />
<col width="11%" />
<col width="11%" />
</colgroup>
<thead>
<tr class="header">
<th align="left">model</th>
<th align="right">meanPrecision</th>
<th align="right">sePrecision</th>
<th align="right">meanRecall</th>
@@ -2197,12 +2199,40 @@ <h2><span class="header-section-number">5.1</span> How do results differ
</thead>
<tbody>
<tr class="odd">
<td align="right">0.4344882</td>
<td align="right">0.0084623</td>
<td align="right">0.5272768</td>
<td align="right">0.0104531</td>
<td align="right">0.4517983</td>
<td align="right">0.0072764</td>
<td align="left">gpt4-omni-ts</td>
<td align="right">0.4194773</td>
<td align="right">0.0165780</td>
<td align="right">0.5465613</td>
<td align="right">0.0206740</td>
<td align="right">0.4519953</td>
<td align="right">0.0145372</td>
</tr>
<tr class="even">
<td align="left">gpt4-omni-zs</td>
<td align="right">0.4117923</td>
<td align="right">0.0191232</td>
<td align="right">0.4988831</td>
<td align="right">0.0196749</td>
<td align="right">0.4250843</td>
<td align="right">0.0150289</td>
</tr>
<tr class="odd">
<td align="left">llama3-70b-in-ts</td>
<td align="right">0.4372443</td>
<td align="right">0.0146389</td>
<td align="right">0.5267929</td>
<td align="right">0.0209162</td>
<td align="right">0.4550647</td>
<td align="right">0.0137912</td>
</tr>
<tr class="even">
<td align="left">llama3-70b-in-zs</td>
<td align="right">0.4694388</td>
<td align="right">0.0167251</td>
<td align="right">0.5368701</td>
<td align="right">0.0222861</td>
<td align="right">0.4750490</td>
<td align="right">0.0146080</td>
</tr>
</tbody>
</table>
@@ -2227,21 +2257,26 @@ <h2><span class="header-section-number">5.2</span> How do results differ
meanRecall=mean(recall),
seRecall=std.error(recall),
meanF1=mean(f1),
sef1=std.error(f1))

kable(CT_Pub_MT_results.df, caption=&quot;Differences by Model and Subgroup on CT-pub&quot;)</code></pre>
sef1=std.error(f1))</code></pre>
<pre><code>## `summarise()` has grouped output by &#39;model&#39;. You can override using the
## `.groups` argument.</code></pre>
<pre class="r"><code>kable(CT_Pub_MT_results.df, caption=&quot;Differences by Model and Subgroup on CT-pub&quot;)</code></pre>
<table>
<caption>Differences by Model and Subgroup on CT-pub</caption>
<colgroup>
<col width="20%" />
<col width="17%" />
<col width="16%" />
<col width="14%" />
<col width="14%" />
<col width="14%" />
<col width="15%" />
<col width="21%" />
<col width="13%" />
<col width="11%" />
<col width="10%" />
<col width="9%" />
<col width="9%" />
<col width="9%" />
</colgroup>
<thead>
<tr class="header">
<th align="left">model</th>
<th align="left">trial_group</th>
<th align="right">meanPrecision</th>
<th align="right">sePrecision</th>
<th align="right">meanRecall</th>
@@ -2252,12 +2287,204 @@ <h2><span class="header-section-number">5.2</span> How do results differ
</thead>
<tbody>
<tr class="odd">
<td align="right">0.4344882</td>
<td align="right">0.0084623</td>
<td align="right">0.5272768</td>
<td align="right">0.0104531</td>
<td align="right">0.4517983</td>
<td align="right">0.0072764</td>
<td align="left">gpt4-omni-ts</td>
<td align="left">cancer</td>
<td align="right">0.3376333</td>
<td align="right">0.0399497</td>
<td align="right">0.5430899</td>
<td align="right">0.0392855</td>
<td align="right">0.3970881</td>
<td align="right">0.0344699</td>
</tr>
<tr class="even">
<td align="left">gpt4-omni-ts</td>
<td align="left">chronic kidney disease</td>
<td align="right">0.4430021</td>
<td align="right">0.0303595</td>
<td align="right">0.5625353</td>
<td align="right">0.0518243</td>
<td align="right">0.4789217</td>
<td align="right">0.0327797</td>
</tr>
<tr class="odd">
<td align="left">gpt4-omni-ts</td>
<td align="left">diabetes</td>
<td align="right">0.4315031</td>
<td align="right">0.0261183</td>
<td align="right">0.5984520</td>
<td align="right">0.0363788</td>
<td align="right">0.4815179</td>
<td align="right">0.0242987</td>
</tr>
<tr class="even">
<td align="left">gpt4-omni-ts</td>
<td align="left">hypertension</td>
<td align="right">0.4936892</td>
<td align="right">0.0421646</td>
<td align="right">0.5076353</td>
<td align="right">0.0570618</td>
<td align="right">0.4708821</td>
<td align="right">0.0340076</td>
</tr>
<tr class="odd">
<td align="left">gpt4-omni-ts</td>
<td align="left">obesity</td>
<td align="right">0.3882668</td>
<td align="right">0.0495100</td>
<td align="right">0.4659332</td>
<td align="right">0.0487458</td>
<td align="right">0.4034206</td>
<td align="right">0.0390612</td>
</tr>
<tr class="even">
<td align="left">gpt4-omni-zs</td>
<td align="left">cancer</td>
<td align="right">0.3593896</td>
<td align="right">0.0648699</td>
<td align="right">0.5177248</td>
<td align="right">0.0378228</td>
<td align="right">0.3884822</td>
<td align="right">0.0416444</td>
</tr>
<tr class="odd">
<td align="left">gpt4-omni-zs</td>
<td align="left">chronic kidney disease</td>
<td align="right">0.4498255</td>
<td align="right">0.0359125</td>
<td align="right">0.4961775</td>
<td align="right">0.0422165</td>
<td align="right">0.4550535</td>
<td align="right">0.0337143</td>
</tr>
<tr class="even">
<td align="left">gpt4-omni-zs</td>
<td align="left">diabetes</td>
<td align="right">0.4398874</td>
<td align="right">0.0275178</td>
<td align="right">0.5670120</td>
<td align="right">0.0350287</td>
<td align="right">0.4747453</td>
<td align="right">0.0243584</td>
</tr>
<tr class="odd">
<td align="left">gpt4-omni-zs</td>
<td align="left">hypertension</td>
<td align="right">0.4110457</td>
<td align="right">0.0525773</td>
<td align="right">0.4404236</td>
<td align="right">0.0524825</td>
<td align="right">0.3916399</td>
<td align="right">0.0329269</td>
</tr>
<tr class="even">
<td align="left">gpt4-omni-zs</td>
<td align="left">obesity</td>
<td align="right">0.3678517</td>
<td align="right">0.0488932</td>
<td align="right">0.4016209</td>
<td align="right">0.0472745</td>
<td align="right">0.3598584</td>
<td align="right">0.0359429</td>
</tr>
<tr class="odd">
<td align="left">llama3-70b-in-ts</td>
<td align="left">cancer</td>
<td align="right">0.4093769</td>
<td align="right">0.0321481</td>
<td align="right">0.5666599</td>
<td align="right">0.0483330</td>
<td align="right">0.4519619</td>
<td align="right">0.0284682</td>
</tr>
<tr class="even">
<td align="left">llama3-70b-in-ts</td>
<td align="left">chronic kidney disease</td>
<td align="right">0.4538399</td>
<td align="right">0.0367768</td>
<td align="right">0.5158242</td>
<td align="right">0.0568843</td>
<td align="right">0.4591601</td>
<td align="right">0.0350896</td>
</tr>
<tr class="odd">
<td align="left">llama3-70b-in-ts</td>
<td align="left">diabetes</td>
<td align="right">0.4571022</td>
<td align="right">0.0248011</td>
<td align="right">0.5708732</td>
<td align="right">0.0343273</td>
<td align="right">0.4862324</td>
<td align="right">0.0235926</td>
</tr>
<tr class="even">
<td align="left">llama3-70b-in-ts</td>
<td align="left">hypertension</td>
<td align="right">0.4983549</td>
<td align="right">0.0370043</td>
<td align="right">0.4818363</td>
<td align="right">0.0599463</td>
<td align="right">0.4657995</td>
<td align="right">0.0376744</td>
</tr>
<tr class="odd">
<td align="left">llama3-70b-in-ts</td>
<td align="left">obesity</td>
<td align="right">0.3603799</td>
<td align="right">0.0328818</td>
<td align="right">0.4540276</td>
<td align="right">0.0437946</td>
<td align="right">0.3865056</td>
<td align="right">0.0317837</td>
</tr>
<tr class="even">
<td align="left">llama3-70b-in-zs</td>
<td align="left">cancer</td>
<td align="right">0.4138544</td>
<td align="right">0.0414752</td>
<td align="right">0.6322974</td>
<td align="right">0.0492822</td>
<td align="right">0.4836421</td>
<td align="right">0.0366250</td>
</tr>
<tr class="odd">
<td align="left">llama3-70b-in-zs</td>
<td align="left">chronic kidney disease</td>
<td align="right">0.5265988</td>
<td align="right">0.0432749</td>
<td align="right">0.5701615</td>
<td align="right">0.0637368</td>
<td align="right">0.5070008</td>
<td align="right">0.0362418</td>
</tr>
<tr class="even">
<td align="left">llama3-70b-in-zs</td>
<td align="left">diabetes</td>
<td align="right">0.4925006</td>
<td align="right">0.0255036</td>
<td align="right">0.5353139</td>
<td align="right">0.0350723</td>
<td align="right">0.4980289</td>
<td align="right">0.0246477</td>
</tr>
<tr class="odd">
<td align="left">llama3-70b-in-zs</td>
<td align="left">hypertension</td>
<td align="right">0.5075860</td>
<td align="right">0.0488254</td>
<td align="right">0.5109607</td>
<td align="right">0.0630818</td>
<td align="right">0.4757989</td>
<td align="right">0.0373848</td>
</tr>
<tr class="even">
<td align="left">llama3-70b-in-zs</td>
<td align="left">obesity</td>
<td align="right">0.3884561</td>
<td align="right">0.0340609</td>
<td align="right">0.4418456</td>
<td align="right">0.0460536</td>
<td align="right">0.3914692</td>
<td align="right">0.0307577</td>
</tr>
</tbody>
</table>
Binary file modified StudentNotebooks/Assignment03/dar-f24-assignment3-template.pdf
Binary file not shown.
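
For reference, the per-model, per-subgroup table in section 5.2 could be produced by a grouped summary along the following lines. This is only a minimal sketch under stated assumptions: the input data frame `toy_results.df` and its values are hypothetical and made up for illustration (they are not CTBench results), `std.error()` is assumed to come from the `plotrix` package, and the grouping columns `model` and `trial_group` are inferred from the table headers in the diff.

```r
library(dplyr)    # group_by(), summarise()
library(plotrix)  # std.error(); assumed to be the source of the notebook's std.error()
library(knitr)    # kable()

# Hypothetical toy input: one row per evaluated trial.
# These numbers are invented for illustration only.
toy_results.df <- data.frame(
  model       = c("model-a", "model-a", "model-b", "model-b"),
  trial_group = c("diabetes", "diabetes", "diabetes", "diabetes"),
  precision   = c(0.40, 0.50, 0.45, 0.55),
  recall      = c(0.50, 0.60, 0.55, 0.65),
  f1          = c(0.44, 0.55, 0.50, 0.60)
)

# Mean and standard error of each metric per (model, trial_group) pair.
# Column names mirror the headers shown in the rendered tables.
toy_summary.df <- toy_results.df %>%
  group_by(model, trial_group) %>%
  summarise(meanPrecision = mean(precision),
            sePrecision   = std.error(precision),
            meanRecall    = mean(recall),
            seRecall      = std.error(recall),
            meanF1        = mean(f1),
            sef1          = std.error(f1),
            .groups = "drop")  # ungroup the result explicitly

kable(toy_summary.df, caption = "Toy summary by model and subgroup")
```

Passing `.groups = "drop"` is optional; it returns an ungrouped result and suppresses the "`summarise()` has grouped output by 'model'" message that appears in the rendered HTML.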