From a18f8ce79f94377ba5bd68c34cecf2afc7f84c9e Mon Sep 17 00:00:00 2001
From: parks14
Date: Thu, 12 Dec 2024 11:26:14 -0500
Subject: [PATCH 1/3] submission

---
 .../parks14_final_notebook.Rmd     |  377 +++
 .../parks14_final_notebook.nb.html | 2445 +++++++++++++++++
 .../parks14_final_notebook.pdf     |  Bin 0 -> 2624473 bytes
 3 files changed, 2822 insertions(+)
 create mode 100644 StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.Rmd
 create mode 100644 StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.nb.html
 create mode 100644 StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.pdf

diff --git a/StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.Rmd b/StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.Rmd
new file mode 100644
index 0000000..d96d03d
--- /dev/null
+++ b/StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.Rmd
@@ -0,0 +1,377 @@
---
title: "Data Analytics Research CT-Eval Final Report"
author: "Samuel Park"
date: "Fall 2024"
output:
  pdf_document:
    toc: yes
    toc_depth: '3'
  html_notebook: default
  html_document:
    toc: yes
    toc_depth: 3
    toc_float: yes
    number_sections: yes
    theme: united
---
# DAR Project and Group Members

* Project name: CT Eval
* Project team members: Ziyi Baom, Corey Curran, Xiheng Liu, Tianhao Zhao (Victor), Tianyan Lin, Mingyang Li, Samuel Park, Soumeek Mishra, Yashas Balaji

# 0.0 Preliminaries.

No packages are required for this notebook. The linked .Rmd and .R files in other GitHub repositories carry their own technical instructions and comments.

# 1.0 Project Introduction

The CTEval project consists of analytical and technical methods, together with an R Shiny app, for analyzing how well large language models (LLMs) can generate the reference baseline features of clinical trials (CTs) given CT metadata, and for evaluating and benchmarking those generations. Specifically, we focused on prompt engineering to make LLMs produce these reference features, and on benchmarking the generated features against the true values of known CTs. The evaluation team focused on analyzing these results to find trends tied to particular features, models, and other independent variables specific to this project. This notebook focuses on the translation of the original Python code to R and on the implementation of LLM evaluation and benchmarking in an R Shiny app. For context, the R Shiny app is a web app built to give those in the CT domain a user-friendly platform for the aforementioned features.

# 2.0 Organization of Report

This report is organized as follows:

* Section 3.0: Translation of the CTEval code. The original codebase is written in Python, where all of the generation, evaluation, and benchmarking of LLMs is performed. To allow R code to perform these same functionalities, I focused on translating the pertinent functions and files to R so that these scripts can be run in R, in an effort to make the process of data generation and analysis more concise.

* Section 4.0: Implementation of Evaluation and Benchmarking in the R Shiny App. This section covers the results of implementing the evaluation and benchmarking features in the R Shiny app.
* Section 5.0: Hallucination Mitigation and Prompt Engineering for Meta LLaMA and OpenAI GPT. This section covers the prompt engineering methods used to obtain viable results in the evaluation stage, specifically from the Llama LLM.

* Section 6.0: Overall conclusions and suggestions.



# 3.0 Finding 1: Translation of CTEval codebase from Python to R

The primary goal was to translate the CTEval codebase from Python to R to enhance compatibility with existing R-based workflows. Some questions driving this effort were: How effectively can Python logic and constructs be translated to R? What modifications are necessary to preserve performance and functionality?

The approach consisted of analyzing the Python code to identify its core functionalities, including data manipulation, feature matching, and evaluation logic. By leveraging R-specific libraries such as dplyr and purrr, I was able to mimic the Python logic.

The outcome was that the functionality of the Python code was successfully replicated in R.


## 3.1 Data, Code, and Resources

The main datasets used are the CT_Pub_updated.df and CT_Repo_updated.df dataframes. Specifics of each dataframe can be examined in the linked .Rmd file.

1. CTBench_LLM_prompt.Rmd is the Rmd containing the R code, with markdown explanations of the data preparation, function translation, and API calls of the translated code. ALL the code in this section can be found in the following:
[CTBench_LLM_prompt.Rmd link](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentNotebooks/Assignment08_FinalProjectNotebook/CTBench_LLM_promt.Rmd).


## 3.2 Contribution

The code translated for this section is work done solely by me. However, Soumeek Mishra also worked on the translation of the codebase from Python to R, and his code contains slight differences in its ability to make API calls to a separate set of LLMs.


## 3.3 Methods Description

The work I did for this section did not involve any analytical methods.

The methods I did use for the translation included analyzing the functions and goals of the original Python codebase, researching which R libraries could provide support similar to the structures used in Python (API calls, pandas data structures), and lastly using these to produce R code able to emulate the results of the original codebase.
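To make the translation pattern concrete, the following is a minimal sketch, not the project's actual code, of a single chat-completion call and a batched apply. It assumes the openai package listed in the bibliography and the purrr package mentioned above; the prompt text, model name, and `build_user_prompt()` helper are placeholders introduced here for illustration.

```{r, eval=FALSE}
library(openai)  # chat-completion API calls, replacing the Python openai client
library(purrr)   # map functions, replacing pandas-style apply loops

# Single chat-completion call: the R analogue of the Python client call.
ask_llm <- function(system_prompt, user_prompt, model) {
  response <- create_chat_completion(
    model = model,
    messages = list(
      list(role = "system", content = system_prompt),
      list(role = "user",   content = user_prompt)
    )
  )
  # Extract the text of the first completion; the exact field path can vary
  # with the package version.
  response$choices$message.content[1]
}

# Batched generation over a dataframe column: the analogue of pandas' apply().
# build_user_prompt() is a hypothetical helper that formats trial metadata.
sys_prompt <- "placeholder system prompt"
candidates <- map_chr(CT_Pub_updated.df$NCTId,
                      ~ ask_llm(sys_prompt, build_user_prompt(.x),
                                model = "placeholder-model-name"))
```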
## 3.4 Result and Discussion

This section contains the results of the major R functions derived from translating the CTEval codebase to R. For reference, the source file can be found here: [CTBench_LLM_prompt.Rmd link](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentNotebooks/Assignment08_FinalProjectNotebook/CTBench_LLM_promt.Rmd)

The following is the result of running the generation prompt on an LLM for both single-shot and triple-shot generation, specifically on trial NCT00395746 using the generation model Meta-Llama-3.1-8B-Instruct.

![Single Generation](images/Screenshot_2024-11-29_at_7.23.50PM.png)

Batch generation is the process of obtaining generated results for different trials in a "batch," i.e., in bulk, such that features are generated for the whole dataset of trials. The inputs include the dataframe (CT_Pub_updated.df or CT_Repo_updated.df) and the specific model used to generate the candidate features.

When running batch generation on multiple CT trials, the resulting metadata and generation results are stored in a dataframe with the following format, with additional columns added for evaluation and benchmarking. In more detail, whereas the original CTBench representation contains the NCTid, generation model, and generated candidate features, the dataframe I developed also contains the matching model, length of matches, length of unmatched references, length of unmatched candidate features, precision, recall, and f1 columns. This provides a common dataframe containing the majority of trial information, which can be parsed into more use-case-friendly dataframes.
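The following is an illustrative sketch of that combined schema. The three count columns carry the actual names (len_matches, len_reference, len_candidate) referenced in the benchmarking discussion below; the remaining column names are placeholders, as the exact names are defined in CTBench_LLM_prompt.Rmd.

```{r, eval=FALSE}
# Illustrative schema of the combined batch-results dataframe (names partly assumed)
batch_results.df <- data.frame(
  NCTId              = character(),  # trial identifier
  gen_model          = character(),  # LLM that generated the candidate features
  candidate_features = character(),  # generated candidate features
  match_model        = character(),  # LLM used to match candidates to references
  len_matches        = integer(),    # number of matched feature pairs
  len_reference      = integer(),    # number of unmatched reference features
  len_candidate      = integer(),    # number of unmatched candidate features
  precision          = numeric(),
  recall             = numeric(),
  f1                 = numeric()
)
```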
![Batch Generation1](images/Screenshot_2024-11-29_at_7.24.00PM.png)
![Batch Generation2](images/Screenshot_2024-11-29_at_7.24.07PM.png)
![Batch Generation3](images/Screenshot_2024-11-29_at_7.24.16PM.png)

The following is an example of the output after running the evaluation prompt on a single trial (using generated output from single generation, with Meta-Llama-3.1-8B-Instruct as the matching LLM).
The output consists of an R dataframe object with the matched features:
![Single Evaluation 1](images/Screenshot_2024-11-29_at_7.24.32PM.png)

A second portion for the unmatched reference features:
![Single Evaluation 2](images/Screenshot_2024-11-29_at_7.24.39PM.png)

The last portion for the unmatched candidate features:
![Single Evaluation 3](images/Screenshot_2024-11-29_at_7.24.50PM.png)

Batch evaluation, similar to batch generation, runs the evaluation algorithm on all trials contained in the pertinent dataframe.

When running batch evaluation, the lengths of the matches, unmatched reference features, and unmatched candidate features are stored in the aforementioned dataframe created during batch generation:
![Batch Evaluation](images/Screenshot_2024-11-29_at_7.25.08PM.png)

Lastly, one can perform benchmarking to retrieve the precision, recall, and f1 scores associated with the generation and evaluation of multiple different trials. As with batch generation and evaluation, the benchmark metrics are derived for all trials in the pertinent dataframe, provided the relevant information is available, i.e., the len_matches, len_reference, and len_candidate columns are populated.
![Batch Benchmarking](images/Screenshot_2024-11-29_at_7.25.32PM.png)
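For reference, the three metrics are computed from the match counts in the standard way, matching the calculation in the RemoveHallucinations_v2 function shown in Section 5.4, where the denominators count all candidate and all reference features respectively:

$$
\text{precision}=\frac{n_{\text{matches}}}{n_{\text{candidates}}},\qquad
\text{recall}=\frac{n_{\text{matches}}}{n_{\text{references}}},\qquad
F_1=\frac{2\cdot\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}}
$$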
## 3.5 Conclusions, Limitations, and Future Work.

The translation of the CTEval codebase from Python to R significantly enhances its integration with existing R-based workflows in clinical trial evaluations. While the core functionality of the Python code was successfully preserved, the translation lacks sufficient abstraction, which limits its scalability and ease of use. Future work should focus on creating an object-oriented design that encapsulates all functionalities, enabling better flexibility and extensibility. Additionally, performance benchmarking and optimization should be conducted, especially for handling large datasets.



# 4.0 Finding 2: Evaluation Functionality in CTBS App

The major finding was the successful integration of evaluation and benchmarking features into an R Shiny app, enabling an interactive and dynamic environment for analyzing data. The primary questions driving this effort were: How effectively can R Shiny support real-time data generation and benchmarking? How can evaluation metrics be seamlessly incorporated into a user-friendly interface?

The approach involved designing a user-centric interface that could display key evaluation metrics and benchmark results dynamically. Core functionalities like data manipulation and metric calculation were implemented using R Shiny's reactive framework.

The outcome is a fully functional Shiny app that allows users to evaluate and compare data efficiently. It provides real-time updates and intuitive controls, enhancing both usability and analytical capabilities.

## 4.1 Data, Code, and Resources

The main datasets used are the CT_Pub_updated.df and CT_Repo_updated.df dataframes. Specifics of each dataframe can be examined in the linked app.R file.

1. app.R is the main R file containing both the server logic and the frontend GUI code: [app.R link](https://github.rpi.edu/DataINCITE/DAR-CTBSApp-F24/blob/main/app.R). Note that all the output in the following sections can be reproduced with the code in the provided link.

The web application can simply be launched locally with the following command:
```{r, eval=FALSE}
shiny::runApp("app.R")
```

## 4.2 Contribution

This app was built in collaboration with Xiheng Liu and Tianyan Lin. Xiheng and Tianyan contributed the bulk of the web application development, building both the structure of the app and the implementation of LLM generation.

Given the codebase structure they developed, I implemented the LLM evaluation and benchmarking capabilities. Because the codebase has many moving parts, I worked closely with both Xiheng and Tianyan.

## 4.3 Methods Description

The development of the LLM evaluation and benchmarking features did not rely on analytical methods but instead focused on adapting the functionality to an R Shiny app environment. This process involved careful consideration of how the interactive nature of Shiny apps could be leveraged to enhance usability.

The approach began with analyzing the goals of the original evaluation framework, particularly how user inputs and dynamic updates would be handled. One of the key challenges was ensuring smooth data flow between reactive elements, such as user-generated prompts and real-time results. The intricacies of managing reactivity and state within the Shiny app required precise structuring of the code to avoid unintended updates or performance lags.

Much of the foundational work from Section 3 was reused to streamline the implementation. Functions developed earlier were integrated to handle core operations like data manipulation and evaluation logic, allowing the focus to shift toward building a responsive, user-friendly interface. This method ensured that the essential features of the evaluation process were preserved while adapting them to the interactive capabilities of the Shiny app.
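The following is a minimal, self-contained sketch, not the app's actual code, of the reactive pattern just described: a dropdown for the evaluator LLM, a button that triggers the evaluation, and a reactiveVal holding the latest result. The model list and the `run_evaluation()` helper are placeholders standing in for the app's real prompt-construction and API-call pipeline.

```{r, eval=FALSE}
library(shiny)

ui <- fluidPage(
  selectInput("eval_model", "Evaluator LLM",
              choices = c("Meta-Llama-3.1-8B-Instruct", "placeholder-model")),
  actionButton("run_eval", "Run Evaluation"),
  verbatimTextOutput("metrics")
)

server <- function(input, output, session) {
  eval_result <- reactiveVal(NULL)  # holds the latest evaluation result

  observeEvent(input$run_eval, {
    # run_evaluation() stands in for prompt construction, the API call, and
    # the JSON repair steps described in Section 4.4
    eval_result(run_evaluation(model = input$eval_model))
  })

  output$metrics <- renderPrint({
    req(eval_result())  # wait until an evaluation has been run
    eval_result()[c("precision", "recall", "f1")]
  })
}

# shinyApp(ui, server)
```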
## 4.4 Result and Discussion
This section contains a step-by-step walkthrough of the Evaluation and Benchmarking portion of the R Shiny app.

On the home screen, there is an "Evaluate" tab in the navigation bar located at the top of the page, alongside the app's other functionalities. Clicking this tab navigates to the Evaluation portion of the app.
![Home Screen](images/Screenshot 2024-11-30 at 9.46.48AM.png)

On the evaluation page there are two sidebar tabs. The first is the Options tab, which contains a dropdown menu for the evaluator LLM choice (set to "Meta-Llama-3.1-8B-Instruct" by default), a "Run Evaluation" button to perform the evaluation and benchmarking, and lastly a textbox containing the generated candidate features from the "Generate Descriptors" portion of the app.
![Eval Page](images/Screenshot 2024-11-30 at 9.48.16AM.png)

The Evaluator LLM list contains all the supported LLMs that can be used to evaluate the generated candidate descriptors against the specified clinical trial's original reference features. Multiple prompts are used for the different models to mitigate issues in the evaluation stage.
![LLM List](images/Screenshot 2024-11-30 at 9.48.36AM.png)

When the "Run Evaluation" button is clicked, multiple steps are performed by the server logic of the application. First, the metadata of the trial selected in the "Specify Trial" section of the app is parsed. With this parsed information, different system and user prompts are constructed depending on the evaluating LLM. Next, an API call is made to the specified LLM. Once the output is received, multiple error checks and helper functions are used to either fix up any invalid JSON output or retry the evaluation. The results are then stored in reactive values ("reactiveVal" objects), dynamic containers provided by the shiny library.
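The following sketch shows the shape of that validate-or-retry step, using jsonlite; it is an assumed illustration rather than the app's exact helper, and `call_evaluator()` is a placeholder for the API call that returns the evaluator LLM's raw text.

```{r, eval=FALSE}
library(jsonlite)

# Strip any prose or markdown fences around the outermost JSON object,
# then attempt to parse; return NULL on failure instead of erroring.
parse_evaluation <- function(raw_text) {
  json_text <- sub("^[^{]*", "", raw_text)  # drop everything before the first "{"
  json_text <- sub("[^}]*$", "", json_text) # drop everything after the last "}"
  tryCatch(fromJSON(json_text), error = function(e) NULL)
}

# Retry the evaluation until the response parses as valid JSON.
evaluate_with_retry <- function(max_tries = 3) {
  for (i in seq_len(max_tries)) {
    parsed <- parse_evaluation(call_evaluator())  # placeholder API call
    if (!is.null(parsed)) return(parsed)
  }
  stop("Evaluator did not return valid JSON after ", max_tries, " attempts")
}
```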
Once the evaluation completes, we can navigate to the "Report Page". This page includes multiple results and metrics.

The first is the True Matches; this list contains the set of original reference features that were matched with generated candidate features.

The second section contains features that were hallucinated by the evaluator LLM.

The third section contains Unmatched Reference Features: all the original features in the clinical trial that have no matching counterpart in the generated features.
![Eval1](images/Screenshot 2024-11-30 at 9.49.13AM.png)

The fourth section contains Unmatched Candidate Features: all generated features that have no corresponding matches in the set of original reference features.
![Eval2](images/Screenshot 2024-11-30 at 9.49.53AM.png)

Lastly, the performance of the evaluator LLM is represented by three metrics: Precision, Recall, and F1 Score.
![Eval 3](images/Screenshot 2024-11-30 at 9.50.07AM.png)

As a final option, users can download the resulting evaluation categories and benchmarking metrics (True Matches, Hallucinations, Unmatched Generated Features, Unmatched Reference Features, Precision, Recall, and F1).
![Download Button](images/Screenshot 2024-12-08 at 5.35.38PM.png)
![Evaluator JSON](images/Screenshot 2024-12-08 at 5.34.08PM.png)

## 4.5 Conclusions, Limitations, and Future Work.

The implementation of LLM evaluation and benchmarking in an R Shiny app successfully demonstrated the ability to dynamically evaluate and compare model outputs in an interactive environment. By integrating real-time user input and reactive displays, the app provided a flexible platform for exploring the capabilities of LLMs in generating clinical trial metadata. The work highlighted the importance of adapting previously developed code for use in a dynamic interface, maintaining consistency with the original evaluation goals while offering a more user-friendly experience.

However, significant limitations were encountered, primarily due to the volatile nature of LLM responses. LLM outputs can vary widely across different runs, even when using the same prompts and configurations. This inconsistency posed challenges for benchmarking, as reproducibility is a critical factor in evaluation. To mitigate this, multiple parsing and cleaning methods were applied to the generated responses to obtain more robust results. Aggregating these outputs helped reduce variability, providing a more reliable basis for comparison.

Future work should focus on improving abstraction and automation within the Shiny app. Currently, the app relies on distinct functions for various tasks; creating an overarching object or framework to encapsulate the entire evaluation and benchmarking process could streamline future enhancements. Additionally, exploring ways to further stabilize LLM responses, potentially by fine-tuning the models or implementing advanced prompt engineering techniques, could enhance the reliability and consistency of the results. This would strengthen the app's utility as a benchmarking tool for LLM performance in clinical trial evaluations.

# 5.0 Finding 3: Hallucination Mitigation and Prompt Engineering for Meta LLaMA and OpenAI GPT

## 5.1 Data, Code, and Resources

* The main dataframe used to test the prompt engineering was CT_Pub_updated.df, shown below. The dataframe was used only for its metadata, which is needed to construct prompts.

```{r}
CT_Pub.df <- readRDS("../../CTBench_source/corrected_data/ct_pub/CT_Pub_data_updated.Rds")
summary(CT_Pub.df, 2)
```
* The original evaluation prompt from the CTBench codebase can be found in the module.py file: [module.py link](https://github.com/nafis-neehal/CTBench_LLM/blob/main/module.py#L164)

* Helper functions to fix the output of evaluator LLMs can be found here: [CTBench_LLM_prompt.Rmd link](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentNotebooks/Assignment08_FinalProjectNotebook/CTBench_LLM_promt.Rmd).

* The R Shiny app that employs techniques to remove hallucinations can be found here: [app.R link](https://github.rpi.edu/DataINCITE/DAR-CTBSApp-F24/blob/main/app.R)

* Lastly, the function "RemoveHallucinations_v2" can be found here: [functions.R](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentData/functions.R)

## 5.2 Contribution

The prompt engineering for this task was independently designed and developed by me. The hallucination mitigation techniques and tools were developed by Corey Curran.

## 5.3 Methods Description

The methods applied in this work involved a structured and iterative approach to prompt development. Initially, model behavior was analyzed by testing the initial prompts with sample inputs to evaluate the quality and structure of the outputs. This step provided insights into how effectively the prompts guided the models in producing accurate and coherent results. Following this analysis, the prompts were refined iteratively to address identified shortcomings, such as instances of invalid JSON outputs or incomplete feature matching. Finally, a cross-model comparison was conducted to ensure that the prompts maintained consistency in functionality and output format across different models, despite inherent differences in syntax and interpretive emphasis.

For removing hallucinations from the evaluator LLMs' output, I employed the R functions Corey Curran developed. Given that these had already been implemented, my task was simply to make sure all passed parameters were in the correct format.

## 5.4 Result and Discussion

Through multiple iterations of testing, a prompt for Meta-Llama-3.1-8B-Instruct was developed. Tests were run on both the GPT and Llama prompts; however, only the Llama prompt needed modification to produce acceptable results.

The following are the two prompts for reference:
```{r}

systemPromptText_Evaluation.gpt <- "You are an expert assistant in the medical domain and clinical trial design. You are provided with details of a clinical trial. Your task is to determine which candidate baseline features match any feature in a reference baseline feature list for that trial. You need to consider the context and semantics while matching the features.

For each candidate feature:

1. Identify a matching reference feature based on similarity in context and semantics.
2. Remember the matched pair.
3. A reference feature can only be matched to one candidate feature and cannot be further considered for any consecutive matches.
4. If there are multiple possible matches (i.e. one reference feature can be matched to multiple candidate features or vice versa), choose the most contextually similar one.
5. Also keep track of which reference and candidate features remain unmatched.

Once the matching is complete, provide the results in a JSON format as follows:
{ \"matched_features\":
    [[ \"<reference feature 1>\", \"<candidate feature 1>\"],
     [ \"<reference feature 2>\", \"<candidate feature 2>\"]],
\"remaining_reference_features\":
    [\"<unmatched reference feature 1>\", \"<unmatched reference feature 2>\"],
\"remaining_candidate_features\":
    [\"<unmatched candidate feature 1>\", \"<unmatched candidate feature 2>\"]}

Don't give code, just return the result."

systemPromptText_Evaluation.llama <- "
    You are an expert assistant in the medical domain and clinical trial design. You are provided with details of a clinical trial.
    Your task is to determine which candidate baseline features match any feature in a reference baseline feature list for that trial.
    You need to consider the context and semantics while matching the features.

    For each candidate feature:

        1. Identify a matching reference feature based on similarity in context and semantics.
        2. Remember the matched pair.
        3. A reference feature can only be matched to one candidate feature and cannot be further considered for any consecutive matches.
        4. If there are multiple possible matches (i.e. one reference feature can be matched to multiple candidate features or vice versa), choose the most contextually similar one.
        5. Also keep track of which reference and candidate features remain unmatched.
    6. DO NOT provide the code to accomplish this and ONLY respond with the following JSON. Perform the matching yourself.
    Once the matching is complete, omitting explanations provide the answer only in the following form:
  {\"matched_features\": [[\"<reference feature 1>\" , \"<candidate feature 1>\" ],[\"<reference feature 2>\" , \"<candidate feature 2>\"]],\"remaining_reference_features\": [\"<unmatched reference feature 1>\" ,\"<unmatched reference feature 2>\"],\"remaining_candidate_features\" : [\"<unmatched candidate feature 1>\" ,\"<unmatched candidate feature 2>\"]}
  7. Please generate a valid JSON object, ensuring it fits within a single JSON code block, with all keys and values properly quoted and all elements closed. Do not include line breaks within array elements."
```

Note the differences in the Llama prompt compared to the GPT prompt.

The first issue present in the Llama results was actual code that could be run to generate the results, rather than the results themselves. This was mitigated by instructions 6 and 7, which repeatedly ask for a valid JSON object and specifically instruct the model not to produce code.

Lastly, one observation regarding the Llama model was that it could not discern whether a newline character in the JSON template was meant to appear inside the JSON output or was there only for formatting. To mitigate this issue, all newlines near or within the JSON template provided in the prompt were removed.

Note that no changes were made to the OpenAI GPT prompt, as its translation into R did not pose any issues.

To combat the presence of hallucinations in the output of the evaluator, the main function used was "RemoveHallucinations_v2". For reference, this function and other related hallucination mitigation and removal functions can be found in [functions.R](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentData/functions.R):
```{r}
RemoveHallucinations_v2<-function(Matches,ReferenceList,CandidateList){
  # Matches should be a two-column matrix or data frame of matches, with
  # column 1 drawn from the reference list and column 2 from the candidate list
  # ReferenceList should be the true reference feature list
  # CandidateList should be the true candidate feature list
  #
  # Currently, this extracts all true (non-hallucinated) matches, all addition
  # match hallucinations (just the hallucinated feature, not the whole match),
  # and all multi-match hallucinations (again, just the hallucinated feature),
  # and calculates the corrected metrics.

  # count the number of times each feature appears in each list; useful for
  # multi-match hallucination identification
  Rtab<-as.data.frame(table(ReferenceList))
  Ctab<-as.data.frame(table(CandidateList))
  MRtab<-as.data.frame(table(Matches[,1]))
  MCtab<-as.data.frame(table(Matches[,2]))

  # Extract the matches in which both the reference feature and candidate
  # feature are real original features
  TrueMatches<-Matches[(Matches[,1]%in%ReferenceList)&
                         (Matches[,2]%in%CandidateList),,drop=FALSE]
  # Extract the addition hallucinations i.e. all the matched features which were
  # not in the original lists
  AHallucinations<-c(Matches[!(Matches[,1]%in%ReferenceList),1],
                     Matches[!(Matches[,2]%in%CandidateList),2])

  # initialize empty vectors for the indices in which multi-match hallucinations
  # occur...
  Hindices<-c()
  # ...and for the hallucinations themselves
  MHallucinations<-c()
  # loop through the rows of the matches, first checking the reference side...
  if (length(TrueMatches)>0){
    for (Riter in 1:nrow(TrueMatches)){
      feat<-TrueMatches[Riter,1]
      if (MRtab$Freq[MRtab$Var1==feat]>Rtab$Freq[Rtab$ReferenceList==feat]){
        MRtab$Freq[MRtab$Var1==feat]=MRtab$Freq[MRtab$Var1==feat]-1
        MHallucinations<-c(MHallucinations,feat)
        Hindices<-c(Hindices,Riter)
      }
    }
    # ...then the candidate side
    for (Citer in 1:nrow(TrueMatches)){
      feat<-TrueMatches[Citer,2]
      if (MCtab$Freq[MCtab$Var1==feat]>Ctab$Freq[Ctab$CandidateList==feat]){
        MCtab$Freq[MCtab$Var1==feat]=MCtab$Freq[MCtab$Var1==feat]-1
        MHallucinations<-c(MHallucinations,feat)
        Hindices<-c(Hindices,Citer)
      }
    }
    if (length(Hindices)>0){
      TrueMatches<-TrueMatches[-Hindices,,drop=FALSE]
    }
  }

  Hallucinations<-c(AHallucinations,MHallucinations)

  precision<-max(nrow(TrueMatches),0,na.rm=TRUE)/length(CandidateList)
  recall<-max(nrow(TrueMatches),0,na.rm=TRUE)/length(ReferenceList)
  f1<-max(2*precision*recall/(precision+recall),0,na.rm=TRUE)

  UnmatchedReferenceFeature<-ReferenceList[!(ReferenceList%in%TrueMatches[,1])]
  UnmatchedCandidateFeature<-CandidateList[!(CandidateList%in%TrueMatches[,2])]

  result<-list(TrueMatches=TrueMatches,Hallucinations=Hallucinations,
               UnmatchedReferenceFeature=UnmatchedReferenceFeature,
               UnmatchedCandidateFeature=UnmatchedCandidateFeature,
               precision=precision,recall=recall,f1=f1)

  return(result)
}
```

In short, this function takes in the matches generated by the evaluation prompt, the list of original reference features from the relevant clinical trial, and the candidate features generated by the generation prompt. Addition and multi-match hallucinations are removed, and a data structure is returned containing the True Matches (with hallucinations removed), the addition and multi-match hallucinations, the unmatched reference and candidate features, and the precision, recall, and f1 scores.
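The following is a toy invocation of the function; the feature names are invented purely for illustration and are not from any real trial.

```{r}
# Toy inputs, invented for illustration only
reference <- c("Age", "Sex", "BMI")
candidate <- c("Age (years)", "Body Mass Index", "Smoking status")
# Matches as an evaluator LLM might return them; "HbA1c" is a hallucinated
# feature that appears in neither original list
matches <- matrix(c("Age",   "Age (years)",
                    "BMI",   "Body Mass Index",
                    "HbA1c", "Smoking status"),
                  ncol = 2, byrow = TRUE)

res <- RemoveHallucinations_v2(matches, reference, candidate)
res$Hallucinations  # "HbA1c"
res$precision       # 2 true matches / 3 candidate features = 0.667
res$recall          # 2 true matches / 3 reference features = 0.667
```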
## 5.5 Conclusions, Limitations, and Future Work.

The prompt engineering process successfully adapted the Meta LLaMA model to perform context-sensitive feature matching within clinical trial datasets. Through iterative refinement, the prompts achieved functional parity across models, with adjustments tailored to each model's specific quirks. The resulting outputs aligned closely with the expected formats, particularly after addressing challenges such as invalid JSON and unnecessary code generation in the LLaMA model.

Future work will focus on automating the prompt refinement process through feedback loops and extending the scope to include more complex feature matching scenarios. Additionally, efforts will aim to generalize the prompts for use with a broader range of models while preserving their precision and reliability, possibly even fine-tuning for the complete removal of hallucinations regardless of model.

# 6.0 Overall Conclusions

The translation of the CTEval codebase from Python to R significantly enhanced the handling of clinical trial evaluations within the R ecosystem. This effort retained the core functionality of the original Python implementation while paving the way for integration into R-specific workflows. However, R's limitations in managing certain complex operations highlighted the need for a more structured and abstracted framework. Addressing these organizational challenges will be crucial for scaling the implementation to handle larger datasets and support additional features effectively.

The development of the R Shiny app for LLM evaluation and benchmarking further contributed to the ease and interactivity of clinical trial evaluations. The app enabled real-time comparison of LLM outputs, allowing users to explore and analyze results dynamically. Despite the utility of this tool, inconsistencies in LLM responses posed challenges; these were mitigated by employing methods to clean and combine outputs, thereby improving result reliability. Future iterations of the app should aim to enhance consistency and incorporate more advanced evaluation metrics.

The prompt engineering process successfully adapted the Meta LLaMA model to perform context-sensitive feature matching within clinical trial datasets. Iterative refinement allowed the prompts to address model-specific challenges, such as invalid JSON generation and unnecessary code output, particularly in the LLaMA model. Additionally, R's inherent difficulties in handling certain prompt complexities were mitigated by simplifying instructions and introducing structured feedback mechanisms. These refinements ensured outputs aligned closely with the required format and functionality. Future work will focus on automating the refinement process, extending the scope to more complex feature matching scenarios, and generalizing prompts to support a wider range of models while maintaining precision and reliability.


# Bibliography

## Significant R packages used

* openai - Provides an interface to interact with OpenAI's API for leveraging language models like GPT in R applications.
* RCurl - Facilitates making HTTP requests and handling data transfer over the web, useful for API calls and data retrieval.
* rlist - Offers tools for manipulating, querying, and transforming nested lists, ideal for handling complex data structures.
* shinyjs - Adds JavaScript capabilities to Shiny apps, enabling enhanced interactivity and dynamic UI behavior beyond standard Shiny functions.
* shiny - A framework for building interactive web applications directly in R, allowing data visualization and reactive user interfaces.

+

Future work will focus on automating the prompt refinement process +through feedback loops and extending the scope to include more complex +feature matching scenarios. Additionally, efforts will aim to generalize +the prompts for use with a broader range of models while preserving +their precision and reliability, possibly even fine tuning for the +complete removal of any hallucinations regardless of model.

+
+
+

6.0 Overall Conclusions

+

The translation of the CTEval codebase from Python to R significantly +enhanced the handling of clinical trial evaluations within the R +ecosystem. This effort retained the core functionality of the original +Python implementation while paving the way for integration into +R-specific workflows. However, R’s limitations in managing certain +complex operations highlighted the need for a more structured and +abstracted framework. Addressing these organizational challenges will be +crucial for scaling the implementation to handle larger datasets and +support additional features effectively.

+

The development of the R Shiny app for LLM evaluation and +benchmarking further contributed to the ease and interactivity of +clinical trial evaluations. The app enabled real-time comparison of LLM +outputs, allowing users to explore and analyze results dynamically. +Despite the utility of this tool, inconsistencies in LLM responses posed +challenges that were mitigated by employing methods to clean and combine +outputs, thereby improving result reliability. Future iterations of the +app should aim to enhance consistency and incorporate more advanced +evaluation metrics.

+

The prompt engineering process successfully adapted the Meta LLaMA +model to perform context-sensitive feature matching within clinical +trial datasets. Iterative refinement allowed the prompts to address +model-specific challenges, such as invalid JSON generation and +unnecessary code output, particularly in the LLaMA model. Additionally, +R’s inherent difficulties in handling certain prompt complexities were +mitigated by simplifying instructions and introducing structured +feedback mechanisms. These refinements ensured outputs aligned closely +with the required format and functionality. Future work will focus on +automating the refinement process, extending the scope to more complex +feature matching scenarios, and generalizing prompts to support a wider +range of models while maintaining precision and reliability.

+
+
+
+

Bibliography

+
+

Significant R packages used

+
    +
  • openai- Provides an interface to interact with OpenAI’s API for +leveraging language models like GPT in R applications.
  • +
  • RCurl - Facilitates making HTTP requests and handling data transfer +over the web, useful for API calls and data retrieval.
  • +
  • rlist - Offers tools for manipulating, querying, and transforming +nested lists, ideal for handling complex data structures.
  • +
  • shinyjs - Adds JavaScript capabilities to Shiny apps, enabling +enhanced interactivity and dynamic UI behavior beyond standard Shiny +functions.
  • +
  • shiny - A framework for building interactive web applications +directly in R, allowing data visualization and reactive user +interfaces.
  • +
+ +
+
+ +
---
title: "Data Analytics Research CT-Eval Final Report"
author: "Samuel Park"
date: "Fall 2024"
output:
  pdf_document:
    toc: yes
    toc_depth: '3'
  html_notebook: default
  html_document:
    toc: yes
    toc_depth: 3
    toc_float: yes
    number_sections: yes
    theme: united
---
# DAR Project and Group Members

* Project name: CT Eval 
* Project team members: Ziyi Baom, Corey Curran, Xiheng Liu, Tianhao Zhao (Victor), Tianyan Lin, Mingyang Li, Samuel Park, Soumeek Mishra, Yashas Balaji

# 0.0 Preliminaries.

No additional packages are required for this notebook. Linked .Rmd and .R files in other GitHub repositories contain their own technical instructions and comments.

# 1.0 Project Introduction

The project CTEval consists of analytical and technical methods, in addition to an R Shiny app, to analyze the ability of large language models (LLMs) tasked with the generation, evaluation, and benchmarking of reference features of clinical trials (CTs) given CT metadata. Specifically, we focused on prompt engineering to make LLMs produce these reference features and on benchmarking the generated features against the true values of known CTs. The evaluation team focused on analyzing these results to find trends specific to certain features, models, and other independent variables specific to this project. This notebook focuses on the translation of the code, originally in Python, to R, and on the implementation of the evaluation and benchmarking of LLMs in an R Shiny app. For context, the R Shiny app is a web app built to provide a platform for those in the CT domain to make use of the aforementioned features in a user-friendly manner.

# 2.0 Organization of Report

This report is organized as follows: 

* Section 3.0. Translation of the CTEval code: The original codebase is written in Python, where all of the generation, evaluation, and benchmarking of LLMs is performed. To allow R code to perform these same functionalities, I focused on translating the pertinent code functions and files to R, such that these scripts can be run in R, in an effort to make the process of data generation and analysis more concise.

* Section 4.0: Implementation of Evaluation and Benchmarking in R Shiny App: This section will cover the results of implementing the evaluation and benchmarking features into the R Shiny app.

* Section 5.0: Hallucination Mitigation and Prompt Engineering for Llama: This section will cover the prompt engineering methods used to obtain viable results for the evaluation section, specifically from the Llama LLM.

* Section 6.0: Overall conclusions and suggestions 



# 3.0 Finding 1: Translation of CTEval codebase from Python to R 

The primary goal was to translate the CTEval codebase from Python to R to enhance compatibility with existing R-based workflows. Some questions driving this effort were: How effectively can Python logic and constructs be translated to R? What modifications were necessary to preserve performance and functionality?

The approach consisted of analyzing the Python code to identify core functionalities, including data manipulation, feature matching, and evaluation logic. By leveraging R-specific libraries like dplyr and purrr, I was able to mimic the Pythonic logic.

The outcome was that the functionality of the Python code was successfully replicated in R.


## 3.1 Data, Code, and Resources

The main datasets used are the CT_Pub_updated.df and CT_Repo_updated.df dataframes. Specifics of each dataframe can be examined in the linked .Rmd file.

1. CTBench_LLM_prompt.Rmd is the Rmd containing the R code, with markdown explanations of the data preparation, function translation, and API calls of the translated code. All the code in this section can be found in the following: 
[CTBench_LLM_prompt.Rmd link](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentNotebooks/Assignment08_FinalProjectNotebook/CTBench_LLM_promt.Rmd). 


## 3.2 Contribution

The code translated for this section is work done solely by me. However, Soumeek Mishra also worked on the translation of the codebase from Python to R, and his code should contain slight differences in the ability to make API calls to a separate set of LLMs.


## 3.3 Methods Description 

The work I did for this section did not involve any analytical methods.

Methods that I did use for the translation included analyzing the functions and goals of the original Python codebase, researching which R libraries could provide support similar to the structures used in Python (API calls, the pandas data structure), and lastly using these to produce R code that emulates the results of the original codebase.
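
To make the style of translation concrete, the chunk below sketches how a typical pandas filter-and-count maps onto dplyr; the dataframe and column names (`results.df`, `gen_response`, `model`) are illustrative placeholders rather than the actual CTBench code.

```{r, eval=FALSE}
# Illustrative translation sketch (placeholder names, not the CTBench code).
# Python/pandas:  df[df["gen_response"].notna()].groupby("model").size()
# The R/dplyr equivalent of the same filter-and-count:
library(dplyr)

results.df %>%
  filter(!is.na(gen_response)) %>%
  count(model)
```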

## 3.4 Result and Discussion

This section will contain the results of the major R functions derived from translating the CTEval codebase to R. For reference the source file can be found here: [CTBench_LLM_prompt.Rmd link](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentNotebooks/Assignment08_FinalProjectNotebook/CTBench_LLM_promt.Rmd)

The following is a result of running the generation prompt on an LLM for both single-shot and triple-shot generation. Specifically run on trial NCT00395746 using the generation model, Meta-Llama-3.1-8B-Instruct.

![Single Generation](images/Screenshot_2024-11-29_at_7.23.50PM.png)

Batch Generation is the process of obtaining the generated results for different trials in a "batch", i.e. in bulk, such that features are generated for the whole dataset of trials. The inputs include the dataframe (CT_Pub_updated.df or CT_Repo_updated.df) and the specific model used to generate the candidate features.

When running batch generation on multiple CT trials, the resulting metadata and generation results are stored in a dataframe with the following format. Additional columns are added for evaluation and benchmarking. In more detail, whereas the original CTBench representation contains the NCTid, generation model, and generated candidate features, the dataframe I developed also contains the matching model, length of matches, length of unmatched reference features, length of unmatched candidate features, precision, recall, and f1 columns. This provides a common dataframe containing the majority of trial information, which can be parsed to create more use-case-friendly dataframes.<br>
![Batch Generation1](images/Screenshot_2024-11-29_at_7.24.00PM.png)  
![Batch Generation2](images/Screenshot_2024-11-29_at_7.24.07PM.png)  
![Batch Generation3](images/Screenshot_2024-11-29_at_7.24.16PM.png)  
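
As a rough sketch of how such a batch loop can be structured in R, the chunk below iterates over the trials and row-binds the results; `generate_candidates()` stands in for the single-trial API call implemented in CTBench_LLM_prompt.Rmd, and the column names are assumptions for illustration.

```{r, eval=FALSE}
# Batch generation sketch: collect the trial id, generation model, and
# generated candidate features for every trial into one dataframe.
# generate_candidates() is a placeholder for the single-trial API call.
batch_generate <- function(trials.df, model) {
  rows <- lapply(seq_len(nrow(trials.df)), function(i) {
    trial <- trials.df[i, ]
    data.frame(
      NCTid        = trial$NCTid,   # trial identifier (assumed column name)
      gen_model    = model,
      gen_response = generate_candidates(trial, model),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}
```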

This is an example of the output after running the evaluation prompt on a single trial (using generated output from single generation, with Meta-Llama-3.1-8B-Instruct as the matching LLM).
The output consists of an R dataframe object with the matched features:  
![Single Evaluation 1](images/Screenshot_2024-11-29_at_7.24.32PM.png)

A second portion for the unmatched reference features:  
![Single Evaluation 2](images/Screenshot_2024-11-29_at_7.24.39PM.png)

The last portion for the unmatched candidate features:  
![Single Evaluation 3](images/Screenshot_2024-11-29_at_7.24.50PM.png)

Batch evaluation, similar to batch generation, runs the evaluation algorithm on all trials contained in the pertinent dataframe.

When running batch evaluation, the lengths of the matches, unmatched reference features, and unmatched candidate features are stored in the aforementioned dataframe created when running batch generation:  
![Batch Evaluation](images/Screenshot_2024-11-29_at_7.25.08PM.png)

Lastly, one can perform benchmarking to retrieve the precision, recall, and f1 scores associated with each trial's generation and evaluation. As with batch generation and evaluation, the benchmark metrics are derived for all trials in the pertinent dataframe, assuming that the relevant information is available; specifically, that the len_matches, len_reference, and len_candidate columns are populated.  
![Batch Benchmarking](images/Screenshot_2024-11-29_at_7.25.32PM.png)
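
As a sketch, the benchmarking step reduces to a few vectorized column operations over that dataframe; this assumes, per the description above, that len_reference and len_candidate hold the counts of unmatched reference and candidate features.

```{r, eval=FALSE}
# Benchmarking sketch: derive precision, recall, and f1 from the counts
# populated during batch generation and evaluation.
library(dplyr)

benchmark <- function(results.df) {
  results.df %>%
    mutate(
      precision = len_matches / (len_matches + len_candidate),
      recall    = len_matches / (len_matches + len_reference),
      f1        = ifelse(precision + recall > 0,
                         2 * precision * recall / (precision + recall),
                         0)
    )
}
```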



## 3.5 Conclusions, Limitations, and Future Work.

The translation of the CTEval codebase from Python to R significantly enhances its integration with existing R-based workflows in clinical trial evaluations. While the core functionality of the Python code was successfully preserved, the translation lacks sufficient abstraction, which limits its scalability and ease of use. Future work should focus on creating an object-oriented design that encapsulates all functionalities, enabling better flexibility and extensibility. Additionally, performance benchmarking and optimization should be conducted, especially for handling large datasets.



# 4.0 Finding 2: Evaluation Functionality in CTBS App 

The major finding was the successful integration of evaluation and benchmarking features into an R Shiny app, enabling an interactive and dynamic environment for analyzing data. The primary questions driving this effort were: How effectively can R Shiny support real-time data generation and benchmarking? How can evaluation metrics be seamlessly incorporated into a user-friendly interface?

The approach involved designing a user-centric interface that could display key evaluation metrics and benchmark results dynamically. Core functionalities like data manipulation and metric calculation were implemented using R Shiny’s reactive framework.

The outcome is a fully functional Shiny app that allows users to evaluate and compare data efficiently. It provides real-time updates and intuitive controls, enhancing both usability and analytical capabilities.

## 4.1 Data, Code, and Resources

The main datasets used are the CT_Pub_updated.df and CT_Repo_updated.df dataframes. Specifics of each dataframe can be examined in the linked app.R file.

1. app.R is the main R file containing both the server logic and frontend GUI code [app.R link](https://github.rpi.edu/DataINCITE/DAR-CTBSApp-F24/blob/main/app.R). Note that all the output in the following sections can be reproduced with the code in the provided link.

The web application can be launched locally with the following command:
```{r, eval=FALSE}
shiny::runApp("app.R")
```

## 4.2 Contribution

This app was built in collaboration with Xiheng Liu and Tianyan Lin. Both Xiheng and Tianyan contributed the bulk of the web application development, building both the structure of the app and the implementation of LLM generation.

Given the codebase structure the two developed, I implemented the LLM evaluation and benchmarking capabilities. Because the codebase has many moving parts, I worked closely with both Xiheng and Tianyan.

## 4.3 Methods Description 

The development of the LLM evaluation and benchmarking features did not rely on analytical methods but instead focused on adapting the functionality to an R Shiny app environment. This process involved careful consideration of how the interactive nature of Shiny apps could be leveraged to enhance usability.

The approach began with analyzing the goals of the original evaluation framework, particularly how user inputs and dynamic updates would be handled. One of the key challenges was ensuring smooth data flow between reactive elements, such as user-generated prompts and real-time results. The intricacies of managing reactivity and state within the Shiny app required precise structuring of the code to avoid unintended updates or performance lags.

Much of the foundational work from Section 3 was reused to streamline the implementation. Functions developed earlier were integrated to handle core operations like data manipulation and evaluation logic, allowing the focus to shift toward building a responsive, user-friendly interface. This method ensured that the essential features of the evaluation process were preserved while adapting them to the interactive capabilities of the Shiny app.



## 4.4 Result and Discussion 
This section will contain a step-by-step walkthrough of the Evaluation and Benchmarking portion of the R Shiny app.

When first loading into the web application, there is a navigation bar at the top of the screen that contains an "Evaluate" tab among the app's other functionalities. By clicking on this tab, one can navigate to the Evaluation portion of the app.  
![Home Screen](images/Screenshot 2024-11-30 at 9.46.48AM.png)

On the evaluation page there are two sidebar tabs. The first is the Options tab, which contains a dropdown menu for the Evaluator LLM choice (set by default to "Meta-Llama-3.1-8B-Instruct"), a "Run Evaluation" button to perform the evaluation and benchmarking, and lastly a textbox containing the generated candidate features from the "Generate Descriptors" portion of the app.  
![Eval Page](images/Screenshot 2024-11-30 at 9.48.16AM.png)

The Evaluator LLM list contains all the supported LLMs that can be used to evaluate the generated candidate descriptors against the specified clinical trial's original reference features. Multiple prompts are used for the different models to mitigate issues in the evaluation stage.  
![LLM List](images/Screenshot 2024-11-30 at 9.48.36AM.png)

When the "Run Evalation" button is clicked, multiple steps are performed by the server logic of the application. First the metadata of the trial selected in the "Specify Trial" section of the app is parsed. With this parsed information multiple different server and user prompts are constructed depending on the Evaluating LLM. Next an API call is made to the specified LLM. Once the output is received, multiple error checks and helper functions are used to either fix up any invalid JSON outputs or retry the evaluation. These results are then stored in "Reactive Vals" which are dynamic types provided by the shiny library.

Once the evaluation completes, we can navigate to the "Report Page". This page includes multiple results and metrics.

The first of these is the True Matches list, which contains the set of original reference features that were matched with generated candidate features.

The second section contains features that were hallucinated by the Evaluator LLM.

The third section contains Unmatched Reference Features, which include all the original features in the clinical trial that have no matching counterpart in the generated features.  
![Eval1](images/Screenshot 2024-11-30 at 9.49.13AM.png)

The fourth section contains the Unmatched Candidate Features, which include all generated features that have no corresponding matches in the set of original reference features.  
![Eval2](images/Screenshot 2024-11-30 at 9.49.53AM.png)

Lastly, the performance of the Evaluator LLM is summarized by three metrics: Precision, Recall, and F1 Score.  
![Eval 3](images/Screenshot 2024-11-30 at 9.50.07AM.png)

For a final option, users can download the resulting evaluation categories and benchmarking metrics (True Matches, Hallucinations, Unmatched Generated Features, Unmatched Reference Features, Precision, Recall, and F1).  
![Download Button](images/Screenshot 2024-12-08 at 5.35.38PM.png)  
![Evaluator JSON](images/Screenshot 2024-12-08 at 5.34.08PM.png)

## 4.5 Conclusions, Limitations, and Future Work.

The implementation of LLM evaluation and benchmarking in an R Shiny app successfully demonstrated the ability to dynamically evaluate and compare model outputs in an interactive environment. By integrating real-time user input and reactive displays, the app provided a flexible platform for exploring the capabilities of LLMs in generating clinical trial metadata. The work highlighted the importance of adapting previously developed code for use in a dynamic interface, maintaining consistency with the original evaluation goals while offering a more user-friendly experience.

However, significant limitations were encountered, primarily due to the volatile nature of LLM responses. LLM outputs can vary widely across different runs, even when using the same prompts and configurations. This inconsistency posed challenges for benchmarking, as reproducibility is a critical factor in evaluation. To mitigate this, multiple parsing and cleaning methods of generated responses were employed to obtain more robust results. Aggregating these outputs helped reduce variability, providing a more reliable basis for comparison.

Future work should focus on improving abstraction and automation within the Shiny app. Currently, the app relies on distinct functions for various tasks, but creating an overarching object or framework to encapsulate the entire evaluation and benchmarking process could streamline future enhancements. Additionally, exploring ways to further stabilize LLM responses, potentially by fine-tuning the models or implementing advanced prompt engineering techniques, could enhance the reliability and consistency of the results. This would strengthen the app's utility as a benchmarking tool for LLM performance in clinical trial evaluations.

# 5.0 Finding 3: Hallucination Mitigation and Prompt Engineering for Meta LLaMA and OpenAI GPT

## 5.1 Data, Code, and Resources

* The main dataframe used to test the prompt engineering was CT_Pub_updated.df, as shown below. The dataframe was only used for its metadata to construct prompts.

```{r}
# Load the corrected CT_Pub dataset and print a condensed column summary
CT_Pub.df <- readRDS("../../CTBench_source/corrected_data/ct_pub/CT_Pub_data_updated.Rds")
summary(CT_Pub.df, maxsum = 2)
```
* The original evaluation prompt from the CTBench codebase can be found in the module.py file: [module.py link](https://github.com/nafis-neehal/CTBench_LLM/blob/main/module.py#L164)

* Helper functions to fix output of Evaluator LLMs can be found here: [CTBench_LLM_prompt.Rmd link](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentNotebooks/Assignment08_FinalProjectNotebook/CTBench_LLM_promt.Rmd). 

* The R shiny app that employs techniques to remove hallucinations can be found here: [app.R link](https://github.rpi.edu/DataINCITE/DAR-CTBSApp-F24/blob/main/app.R)

* Lastly, the function "RemoveHallucinations_v2" can be found here: [functions.R](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentData/functions.R)

## 5.2 Contribution

The prompt engineering for this task was independently designed and developed by me. Hallucination mitigation techniques and tools were developed by Corey Curran.

## 5.3 Methods Description

The methods applied in this work involved a structured and iterative approach to prompt development. Initially, model behavior was analyzed by testing the initial prompts with sample inputs to evaluate the quality and structure of the outputs. This step provided insights into how effectively the prompts guided the models in producing accurate and coherent results. Following this analysis, the prompts were refined iteratively to address identified shortcomings, such as instances of invalid JSON outputs or incomplete feature matching. Finally, a cross-model comparison was conducted to ensure that the prompts maintained consistency in functionality and output format across different models, despite inherent differences in syntax and interpretive emphasis.

For the functions used to remove hallucinations from the Evaluator LLMs' output, I employed the R functions Corey Curran developed. Given that they had already been implemented, my task was simply to make sure all passed parameters were in the correct format.
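
As a minimal sketch of the kind of format check this involved (the function name is illustrative; the three expected keys come from the evaluation prompts shown in the next section):

```{r, eval=FALSE}
# Check that an evaluator response is valid JSON containing the three keys
# requested by the evaluation prompt before passing it on.
library(jsonlite)

is_valid_evaluation <- function(response_text) {
  if (!jsonlite::validate(response_text)) return(FALSE)
  parsed <- jsonlite::fromJSON(response_text)
  all(c("matched_features",
        "remaining_reference_features",
        "remaining_candidate_features") %in% names(parsed))
}
```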

## 5.4 Result and Discussion

Through multiple iterations of testing, a prompt for Meta-Llama-3.1-8B-Instruct was developed. Tests were run on both the GPT and Llama prompts; however, only Llama needed modifications to its prompt to produce borderline acceptable results.

The following are the two prompts for reference:

```{r}

systemPromptText_Evaluation.gpt <- "You are an expert assistant in the medical domain and clinical trial design. You are provided with details of a clinical trial. Your task is to determine which candidate baseline features match any feature in a reference baseline feature list for that trial. You need to consider the context and semantics while matching the features.

For each candidate feature:

1. Identify a matching reference feature based on similarity in context and semantics.
2. Remember the matched pair.
3. A reference feature can only be matched to one candidate feature and cannot be further considered for any consecutive matches.
4. If there are multiple possible matches (i.e. one reference feature can be matched to multiple candidate features or vice versa), choose the most contextually similar one.
5. Also keep track of which reference and candidate features remain unmatched.

Once the matching is complete, provide the results in a JSON format as follows:
{ \"matched_features\":
    [[ \"<reference feature 1>\", \"<candidate feature 1>\"],
     [ \"<reference feature 2>\", \"<candidate feature 2>\"]],
\"remaining_reference_features\":
    [\"<unmatched reference feature 1>\", \"<unmatched reference feature 2>\"],
\"remaining_candidate_features\":
    [\"<unmatched candidate feature 1>\", \"<unmatched candidate feature 2>\"]}

Don't give code, just return the result."

systemPromptText_Evaluation.llama <- "
    You are an expert assistant in the medical domain and clinical trial design. You are provided with details of a clinical trial.
    Your task is to determine which candidate baseline features match any feature in a reference baseline feature list for that trial. 
    You need to consider the context and semantics while matching the features.

    For each candidate feature:
    
        1. Identify a matching reference feature based on similarity in context and semantics.
        2. Remember the matched pair.
        3. A reference feature can only be matched to one candidate feature and cannot be further considered for any consecutive matches.
        4. If there are multiple possible matches (i.e. one reference feature can be matched to multiple candidate features or vice versa), choose the most contextually similar one.
        5. Also keep track of which reference and candidate features remain unmatched.
    6. DO NOT provide the code to accomplish this and ONLY respond with the following JSON. Perform the matching yourself.
    Once the matching is complete, omitting explanations provide the answer only in the following form:
  {\"matched_features\": [[\"<reference feature 1>\" , \"<candidate feature 1>\" ],[\"<reference feature 2>\" , \"<candidate feature 2>\"]],\"remaining_reference_features\": [\"<unmatched reference feature 1>\" ,\"<unmatched reference feature 2>\"],\"remaining_candidate_features\" : [\"<unmatched candidate feature 1>\" ,\"<unmatched candidate feature 2>\"]}
  7. Please generate a valid JSON object, ensuring it fits within a single JSON code block, with all keys and values properly quoted and all elements closed. Do not include line breaks within array elements."
```

Note the differences in the prompt for Meta Llama compared to GPT.

The first issue present in the Llama results was actual code that could be used to write a script to generate results, rather than the results themselves. This was mitigated by instructions 6 and 7, which repeatedly ask for valid JSON objects and specifically instruct the model not to produce code.

Lastly, one observation made regarding the Llama model was that it was not able to discern whether a newline in the prompt's JSON template was meant as a literal newline character in the output or was there purely for formatting reasons. To mitigate this issue, all newlines near or within the JSON template provided in the prompt were removed.
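
A minimal sketch of that cleanup step (the function name is illustrative):

```{r, eval=FALSE}
# Collapse literal newlines in and around the returned JSON before parsing,
# since Llama sometimes copied the template's formatting newlines into its
# output.
clean_llama_json <- function(response_text) {
  gsub("[\r\n]+", " ", response_text)
}
```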

Note that no changes were made to the prompt for the OpenAI GPT model, as its translation into R did not pose any issues.

To combat the presence of hallucinations in the output of the evaluator, the main function used was "RemoveHallucinations_v2". For reference, this function and other related hallucination mitigation and removal functions can be found in [functions.R](https://github.rpi.edu/DataINCITE/DAR-CTEval-F24/blob/main/StudentData/functions.R)

```{r}
RemoveHallucinations_v2<-function(Matches,ReferenceList,CandidateList){
  # Matches should be a list containing the matches, with Matches[1] being from
  # the reference list and Matches[2] being from the candidate list
  # ReferenceList should be the true reference feature list
  # CandidateList should be the true candidate feature list
  # 
  # Currently, this extracts all true (non-hallucinated) matches, all addition
  # match hallucinations (just the hallucinated feature, not the whole match), 
  # and all multi-match hallucinations (again, just the hallucinated feature),
  # and calculates the corrected metrics.
  
  # count the number of times each feature appears in each list; useful for
  # multi-match hallucination identification
  Rtab<-as.data.frame(table(ReferenceList))
  Ctab<-as.data.frame(table(CandidateList))
  MRtab<-as.data.frame(table(Matches[,1]))
  MCtab<-as.data.frame(table(Matches[,2]))
  
  # Extract the matches in which both the reference feature and candidate 
  # feature are real original features
  TrueMatches<-Matches[(Matches[,1]%in%ReferenceList)&
                         (Matches[,2]%in%CandidateList),,drop=FALSE]
  # Extract the addition hallucinations i.e. all the matched features which were
  # not in the original lists
  AHallucinations<-c(Matches[!(Matches[,1]%in%ReferenceList),1],
                     Matches[!(Matches[,2]%in%CandidateList),2])

  # initialize empty vectors for the indices in which multi-match hallucinations
  # occur...
  Hindices<-c()
  # ...and for the hallucinations themselves
  MHallucinations<-c()
  # loop through the rows of the matches
  if (length(TrueMatches)>0){
    for (Riter in 1:nrow(TrueMatches)){
      feat<-TrueMatches[Riter,1]
      if (MRtab$Freq[MRtab$Var1==feat]>Rtab$Freq[Rtab$ReferenceList==feat]){
        MRtab$Freq[MRtab$Var1==feat]=MRtab$Freq[MRtab$Var1==feat]-1
        MHallucinations<-c(MHallucinations,feat)
        Hindices<-c(Hindices,Riter)
      }
    }
    # repeat the multi-match scan on the candidate side of the matches
    for (Citer in 1:nrow(TrueMatches)){
      feat<-TrueMatches[Citer,2]
      if (MCtab$Freq[MCtab$Var1==feat]>Ctab$Freq[Ctab$CandidateList==feat]){
        MCtab$Freq[MCtab$Var1==feat]=MCtab$Freq[MCtab$Var1==feat]-1
        MHallucinations<-c(MHallucinations,feat)
        Hindices<-c(Hindices,Citer)
      }
    }
    if (length(Hindices)>0){
      TrueMatches<-TrueMatches[-Hindices,,drop=FALSE]
    }
  }
  
  Hallucinations<-c(AHallucinations,MHallucinations)
  
  precision<-max(nrow(TrueMatches),0,na.rm=TRUE)/length(CandidateList)
  recall<-max(nrow(TrueMatches),0,na.rm=TRUE)/length(ReferenceList)
  f1<-max(2*precision*recall/(precision+recall),0,na.rm=TRUE)
  
  UnmatchedReferenceFeature<-ReferenceList[!(ReferenceList%in%TrueMatches[,1])]
  UnmatchedCandidateFeature<-CandidateList[!(CandidateList%in%TrueMatches[,2])]
  
  result<-list(TrueMatches=TrueMatches,Hallucinations=Hallucinations,
               UnmatchedReferenceFeature=UnmatchedReferenceFeature,
               UnmatchedCandidateFeature=UnmatchedCandidateFeature,
               precision=precision,recall=recall,f1=f1)
  
  return(result)
}
```

In short, this function takes in the generated candidate features (from the generation prompt), the generated matches (from the evaluation prompt), and the list of original reference features from the relevant clinical trial. Addition and multi-match hallucinations are removed, and a data structure containing the true matches (with hallucinations removed), the addition and multi-match hallucinations, the unmatched reference and candidate features, and the precision, recall, and f1 scores is returned.
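
A toy invocation illustrates the behavior; the feature names below are made up for this example.

```{r, eval=FALSE}
# "BMI" is an addition hallucination: it appears in a match but not in the
# original reference list, so its match is discarded.
matches <- matrix(c("Age", "Age (years)",
                    "BMI", "Body Mass Index"),
                  ncol = 2, byrow = TRUE)
reference <- c("Age", "Sex")
candidate <- c("Age (years)", "Body Mass Index")

res <- RemoveHallucinations_v2(matches, reference, candidate)
res$TrueMatches     # only the ("Age", "Age (years)") pair survives
res$Hallucinations  # "BMI"
res$precision       # 1 true match / 2 candidate features = 0.5
```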

## 5.5 Conclusions, Limitations, and Future Work.

The prompt engineering process successfully adapted the Meta LLaMA model to perform context-sensitive feature matching within clinical trial datasets. Through iterative refinement, the prompts achieved functional similarity across models, with adjustments tailored to each model’s specific quirks. The resulting outputs aligned closely with the expected formats, particularly after addressing challenges such as invalid JSON and unnecessary code generation in the LLaMA model.

Future work will focus on automating the prompt refinement process through feedback loops and extending the scope to include more complex feature matching scenarios. Additionally, efforts will aim to generalize the prompts for use with a broader range of models while preserving their precision and reliability, possibly even fine-tuning for the complete removal of any hallucinations regardless of model.

# 6.0 Overall Conclusions

The translation of the CTEval codebase from Python to R significantly enhanced the handling of clinical trial evaluations within the R ecosystem. This effort retained the core functionality of the original Python implementation while paving the way for integration into R-specific workflows. However, R's limitations in managing certain complex operations highlighted the need for a more structured and abstracted framework. Addressing these organizational challenges will be crucial for scaling the implementation to handle larger datasets and support additional features effectively.

The development of the R Shiny app for LLM evaluation and benchmarking further contributed to the ease and interactivity of clinical trial evaluations. The app enabled real-time comparison of LLM outputs, allowing users to explore and analyze results dynamically. Despite the utility of this tool, inconsistencies in LLM responses posed challenges that were mitigated by employing methods to clean and combine outputs, thereby improving result reliability. Future iterations of the app should aim to enhance consistency and incorporate more advanced evaluation metrics.

The prompt engineering process successfully adapted the Meta LLaMA model to perform context-sensitive feature matching within clinical trial datasets. Iterative refinement allowed the prompts to address model-specific challenges, such as invalid JSON generation and unnecessary code output, particularly in the LLaMA model. Additionally, R’s inherent difficulties in handling certain prompt complexities were mitigated by simplifying instructions and introducing structured feedback mechanisms. These refinements ensured outputs aligned closely with the required format and functionality. Future work will focus on automating the refinement process, extending the scope to more complex feature matching scenarios, and generalizing prompts to support a wider range of models while maintaining precision and reliability.


# Bibliography
## Significant R packages used
* openai - Provides an interface to interact with OpenAI’s API for leveraging language models like GPT in R applications.
* RCurl - Facilitates making HTTP requests and handling data transfer over the web, useful for API calls and data retrieval.
* rlist - Offers tools for manipulating, querying, and transforming nested lists, ideal for handling complex data structures.
* shinyjs - Adds JavaScript capabilities to Shiny apps, enabling enhanced interactivity and dynamic UI behavior beyond standard Shiny functions.
* shiny - A framework for building interactive web applications directly in R, allowing data visualization and reactive user interfaces.




diff --git a/StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.pdf b/StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.pdf
new file mode 100644
Binary files /dev/null and b/StudentNotebooks/Assignment08_FinalProjectNotebook/parks14_final_notebook.pdf differ
zqTt#!c=tb(W9l=p0R5SV_*2g{->_}+5sB@l=J37v!hgD1>56Z18ex@9@{|@-C452!cYMs={?NUyEWn&7O2l?=zVO^Pxrt84t<%@Ttq+VLAwE7Et6ofD zQ~#v%tlT(U)QCy=X0Go4gRx?a+il(Ky0h<0(2*>0IkoD+74=JL|JhO%+si=l0=9d7e*vJ260ReNLv~ow!c5 zSEY*{2~)IyPa|b9QCJqbEkU$ANvSx-^%g^tbZNv(Dym*hoppdsui)K{Y!+*vs0x%% zD(B$5lhBC~JDE=sY%t&AxzZ3(Y}2(wB>#%~PHXe)Mbm!VS^%;iNf5w9{>p3s15Rq| z$m)>a86Fs+T&YX01_mB5D005=+XtKuZ{QgbmkK{pL6oq|u>bl>oJEwt*S0`X%*y=C zfxM|88@pz{&B5Yh+VGB<`{3N~)|jL>azmGWQ#d}|;=Shbmc0y3vJ}`gg_xvF@5#n~ zruCuS`CdpWoY+5lX0vwvjU`#467#D`iT+9K5>=p8ATSE@T{DdZZs|=}HTXq!-+FC7 z-H?ozspIt>XA(0)-;Z>c>GRD&*bNu6`^CKRGx5u}Q`8=`TF!lGlYh!Rl<$2GYVgVN zxp>ou3xLkzE^joq&JWV(8u{^~4W`021HE6ZdQ-INtG`(0tJc0xr+S0FfAI{BVX(;$ z_Zo7K{1{=W?_<02`w!l{7fi|x)Ge{4S-qfcGpJ(>5%jUQLfo`+?qsVLze^I@_S@SU zv+kaTZ0uXT?WM&h9vpC?g;;&Oh2L0R!bL;l;BPL3f)K+!C_c&Sh*u?=51PBiWSKkx z;5B=_oqrs36*0)!If^AeXM;Q1mwBdUS-zvS2fHmp#?WfUJvW^NWXml8ao!Ihf0KWW zc#f)7l~dvtkSQWkP{U_uce;1_kO@K7b`oVR!x5f}4QK9rU%5)mPxPA4YlmuO=OoA1 z5)J@GlrGek=3;a5st5xcJ7YXQX_y!JP2%IJrz_gCo)}*k!0UPMF^ZRKn#^arAT;&c zb8z{gS`Yt<9KJU8r-8gQ0rF&Y@kkf+W%do9(!kHkv(SQjHRJ0 zzx3}bM*}J7pvnEsabxgJmFo>0#gJa zjMx+Gw!VD(Ylow4-5E<$_$AN&IPK*MN>!GdA3r~6e#5y}LGgBp0D#Ci&sR=-&Ae_} zC^!#$kR3 z`IR~HBf)AQPVd6d2+po*r=grka(LtWS?=ZHu>_5G4O}uKe4bFkZVDFR6LDOX1&N49 zOXlQ{uk2b&Uj$j-@m(`Plz;nRjPlnjeU|*3wnNZG+@&8G*KY?szt%dSly(X)KYU|Y zbX6Z*N*WuH-~QeEzu0@vs3_ZQTM#p%h$2Bif=Whmj*1|NWI>P&k|}b|SwKZmkepPI zoHIod2@;E3NA2gKvS)^qT1cY5STh z&oyuq4}EhC<$J;pb-s)0bTx$r zm{u53mL(ccc$nBz-@F8%Ku)|_#G*T%vQv$|hcP^wMWF`nF+ndIKkLtV>G8u5yEk63 zRp_3b|8qS~z~}}6*Q!_mgD8Wp(V*6oh-Rj?h`VX?MXdtr?RYT*qpa=`q-?1xi^Ge<; zJSM;;w!I%ZNE-J-ci=EnV%L>Ohk5JYym`^~_V(tEI!5r@LQMc`z&+xYc#)Hj|Q!2&SP%H}rpzqpEuT2M(T9cAyk&%q*58Ij>;VO-+e47~; z4RJJ=q$8Se<{-JVxKW10gN-D5V<1O1v`sFPezkvC5ZgSG#y*1u=52TvRVvWvCi|n zVG?WN=p84_2K+&FF^?O%zL0f51+m!14U=15tw)@a*LVA&#BU;}@p${Oo}OvSCkci_ zL8&&eW5HF&6mdAa9`e))%b6J0UOJy(?#9oH?)BSkGb;@PPLC~k7tyd2q{nMqm*dc{ zxz&*2yC17^Z5*1TcXXPvF==x;HSPn#T%&TxR$N>Dn;N?2ZvK~d#!y6E(b_uYw+!2> zEcJ&?debJr^Sa z8PyY=VDYtWfzMi@R6n#eof5-KUFIH!GiA4zuywQIWnv=L2d#E*SiYcGZWI!eU0!&` z8XDTXorn?^uYDbS$Z6Q}FwUBXYJ4qXK7dBmA((^z5zqN6H;2Z?mcI0wkmxsCeY-q% z2u%OjH_%r-(yvh!qxGDAI*v38!&D_pe6iDJ(^VxdQSshk|30zHa=Xji1AUL_xv{Fd znV2U!U8;qp8F{0%m|q@Pgx96LUPXEUtNhGO#hrgjdJ=!YNsN;2H>sZI``sk~?Jl7P zLMN<%O@F|;qJ)Wz0Ebcvz7Bl)jZDnBF4f0m@x-_d6$LHCXD%{y42^Wy4}oDMV+82O ze`}w7tvXnFTi9LbX*g000>0J9F~X~t-`{NDqxEWE6s!Ki393>t+@=akgavUbgWSti zx)AHVFojJ=I)VRGsYe!O!6d3ZR%PnIB8a4CpC~i0SXN1DiszBK)KnQU)7opUrVTpv z!QCk!*p$V-xk0m9y>aE?J*_1Y-lSQj!4&qr1!yKa~2yLpMTUKOO-SZ;@(Fpn_LR;CBqHOPwSCC@<&3F$>T)`Dz^=J%V4aK8e*8nFntJ)F z=w*k4JwpK%Z{I4ea*4Xe_>_6ac4iYS!c(}HbT^f;prc*P;r+4f#Q1y7PaiY zvPF&5d=YwqLmlWD`)R@pbL|%^(D94zz0Eyz^*_pg{Z3mkXq!le8P4i#Zb0L=+*Bco zEaA(_GtC+TFkzvtLZS055gckZ44e*}<7zsqooR?t7EL9JQOXIlL98%~J~`IVzo z0KTsLb~vwBP~>fYdGy^TYw-Skj4jnA(>fZ>4CVND5;_Ft`Et`56-d+}ccqoS3pBNL z-hC75A{TIJaxyk0bPM-k7G}&l!DNzqf&LE1mltj_%(>}#!QVH&JROAQ3Yw25UZ z^iDM&jVM>@xOQi6)*CGyb=~^fRv56;Ibq5Bf~z>^hWxPmlO4w8SjR!zmgEJ`yuJrUiINK{hQMc*VW3g!T8oM(cSYL z7C5rmd0Nwkq>ml5OV}E*8v?P)-~w)Mq5S9=_mw9FDReiEX*rGCK8!6Q{k|>%e(9_j zRVM-BSr9s@gK_jq z2d60tnPhFki4*ab7PspoJjqxd{?d{~p;Z#Dx3F3Jt)$SXF+itEI$QlXA;d&+sYe5^ zkW}GEwz=YtTP&stQBQUB;GFFxsY>kHUNogM0(2~y^+&M_FLcbQoK7O*Y8&jsE3=YHR)Y>eLp zd=PrMm?;yh-7DoafN5^ZEkCpT+MFb23!VH`FUew`WMZN&HNLH~@6eeS`e>bm{!;bu zmQzi*a09lic=}c-O?+LT{RmP4WmZ&f6T)LDb@r|lAc|t;s+Vd86OW9Fui>!~yEL>z zAT8)~Btqb8_~lmjkL*(i2)qE~pLL=GrC9*F=eItMoZ zzo=X1WP^tT%Lhsa)`W|hm;}c*-hy#$9cg9*uOL4mx;r<%yc5QDJ*Q{Yrx#z4+G)%? 
z<1~Jy!O7*)w1Mc#-s5NbE0^LVbmqtIv443%{NwSLUzAG}!3K$4X}@-U^)$%x zjTcPHJ&s*6a6Q$Qng=kPh#KcqUI2t*w7i8?v%ryIK)#TDvL|KJ{;E^RQaEwJpD9!0 z{mK~^sr)H6?A8)irk=KTS6gR3RC34&j*@aBK4F67$PxmA=?7=>J+#&-SWNAF6egbP{wl5N;^^jp2 z#1sZK{k)ihkqRmSrsc#bOnt?d#MIFO%t|^0J9@DP_Vg8Jp5-)M@uTzxIgIU>JNC3A zx}e%1UpJpB6Vbb`_$CWe(5k&pT_vOJ?i$sK3a;9}X{UPbTQLy!tQ8daKU`cf{@%wC zF-7{2A$!d#vndY>onbcx3F9RqC3MOSB?SUh&2?U#M2ur}mbdu{RLWEcuUbgumqalA zS!taM{w|e8GWq=GA+etWgX@sHE6)TPO+v9}KbQt4@oqjC3m4T5k?-R5H#^BO@?*su z>zNp=#z2MbG=Ogn=~t&@t+vMt3>w;UVclc(2`Vq*B%gcRs}tT(WSlkR`H{H$%(X*!^t z8&V5pAxXtCS2>94Cb34_945*-H#DAtyY|`7CNXE8ARZRAvW$-= z2&Y`LJeG#F3MLkW`AqQ|VpQFQE)zOjI%kmC{S-jahD%EHPKhI#RUQoL-{#Z?1;2Dh z)&uQnVIr!Ah0p6}(3h$*|q|U*1EwAT5#ygkxBoyk-UWR<#6)oT!x_jEC3Y4k+`4V6Guz3!ity;TH`9a zO|#?8`*f#WW>IJsol3(YD$;H=-!Sg=mZ!R8nha@ec@@?9)gfQgeruO#vJePRw3YGR zoz32mxxfP@I|oavse8F^aY`9>cc z?!rbcbLX(QORSih_1f#Gh_I_wP1l$CNMnPfe})h%i;SViP^ zJ<%@IP5}U*>0-XZHo*FUN?%^%dkLNIL#75?yeuh>i-ogF5k+OBrk7GQmDHhb$Yq|m zdVk#Xa{kcG3SzC+Gfy(H`ULs{cfLxrwaz|TyEYdp0@)+Uc~j}@^`e$wP~1lF`U4T< z64Hv~V(lfaYdQcnZk_!M4Mefz7vD)sV$PbJs*JCx_tW`i+rZ%x%)0fZe8Qb&v3?(f zYK)vZlSJbOh()ckU?AMD<&asN*^a_|tf`-EE(fs%pyYj~oK#CiDmc74)84Qn29YyA%G(WFan+GXyHNjITF^=hA6R(X0a`rR#e*cmca02{7~D6q_M&k1ojXV6=YmSlO{&%1HaTXA0<*WOeQ_zII%J#tE% z>n|H??U{6~HW%zq%~@_TFJa^88LMk|Vwon6Kl6$~m;Nb`F7f;gYGd27^3uo3XepA` z?z2$bB`%Hy#L<93+?ES??g#A#6gme#Oujpp+4tMqS2?K)IK4#Gt*Qz-E%rZu)wy4x zl?XXRSd(z=1-^Rxg^Qmz$I|0YH@+%Nx1n(^z0bIKq#ymvQu}4W8$!v)H|8jpr~Nr1 zMbGlvy=ey5S4w$@$KGuY`q;DljIjIYnM&IE@Glvdo(QFobc(f?j4`&F_D>iZ)oBD2 zNlU|naxYCf#Q?4HPOKCKBA>z?q%qVYcDi2h>U6C#lt5v&v9~47WN{=Wl}7de+*kFmbs_gjHD{ z>L}uMaSiW6t^!+m;51pOyKc~0=w~+f2bLmaxSnQ@n;YBTaUhKy#@TL?RXrDjr|d$@ z(%kVx$Hh*RORJvmW5|oBGT6}Ntd(K*q14~Wq~3j*Re-c!pxB`*JktgoSn8Uh($WEZ zzq`F}y{>ZOGUbP7dW>o*d+gpAcYpnoDmoR;^2(H&_IJ&~skZUMn55Hmu{`GCTD#kM%-eT~6>n9#{(in>B^v(L3h!*` za+#(h;`Mq>Xoys#8aFnYSbC)%cr{5ffEsuMt>WLs$P%Bv&3klRxnm!fs#o2B^Skgp z@~6RyQlU4lRdrmI>XrAudS>(i&W1`-g7H&wt^gv@PAjgl?}4Y=PZ^%m-^_O*jEo$g{9F3!Is_3%^LAmWvO>f$__I~Q|O9F z&q1jscII5$Nt`S)D7g-dJK!l1_SjzA)l|)WAVzUpz(xF+7_($ZTz?q z?J`R}`>P|REbj9|XD;j>-8O&A##(Lsf$f)il62YO;q$QO0@B-1T#~WpmlHTBkm1u|iT`@eIIMq<9l-YRkS?kX8vvv)pKzGpVtZv2t)33lx z{>;p?7e?{O$o_TVqR_#eJ?Azx=4q3%jp5o(wj^njA`ZANv!(HJ8@+Nn3|DUZ49P

?A;s>a3B-RC!P2|1$2H- zcFCmo{Vw_NR@juVB&sN6;7OZmVFf|-tHw6@Z?g~_lLV{&iMhwq_b)E~t;^4#H0A!9 zG~JD699oBPdz~fkJo=IK`3p$##m2l_9ZG!V$BAqB2jqJgSUEaQeKQ#vg58I!kA)-V z1?&d5DI4mVlC$*vNS=SouiaZb*xe8eHO(c(d3mOmJfKPEqcWjSU{sSsWqwiwvK~k# z95}yiT7H^RmVxFILvJ>gFT~HEE@yJdk^0d`pavWQdG}|KtB~V3As?CI=1N`72fPS58nZcS=Ao=JVCR3@14#HlW0GZ=O7BLa%b;1ydmoWsqc%cny59boxiQItIj|)-6SGEyZIy_f z_;wTg7TmA1wMQb^kTkzaejNb2({}QWv!f%njAc%-T(k}_Ox#gsz$b~RBpaBkYF4Qm zz^VlL)+l+yrup`RkiGLDPdrFk=~nUXnD7D78O7$i>oaNAO~4$jsB~|uN7KzfF=9gM zM~c)N=4EEpU1McdXo*mM;QH?I-PnGt4t46-W4L*LxSFry6a|nCtk7%XehEkOMxR-+ z5d`l3VxE8gMJ)PJ3BV1k~5o^&W(@6@d3>pb>Nl3YA~+ z27+@|k%9iQ|J{Xl_PQlNHufrOzWe~L=a{A&u>B@ZVPH(s-Xqy0FaAe6M-eipvsIFy z@!ID_nxI%-gxV(V#j}E|wfok+X`IH-fm029{9b4YVJT}VOd}I=G=o-TP{B*h!aXIU z4gJ%f=P%;jLJ~u{q2N3)Lvm-T3~wFN1#*Bg-LlC|h{-_?Nu_k=Cy>g4%*DJ@{T|Lc zkirBzsn4q4EnC@N=LPgTLHh^Y+4^1>!3NasI)bkS`m zoqz9#3T%1x_f0~CrjF0F@P%K;GqyHzaI`lzu)=<4^V|YYfQN>I2K$YO2p+qto2@Yo z`!mJo7RH8-c&>l+SuBWMgWgp!p73ZUd7hH5Z+MI*xB6BSlS+pVc6x2txe%gc=$MI{(j-H zOPO0b8r##bOIaE?8cP}*+87z*iHYI;<2qeOmg`U9L+9+bk4Kp{SNK2vTNCa7|Ly-K zBQVvDvtck|^nWq^1paOM{c~>c{@3S*m5RJg0tZp`28-ueCTGig6t-c?@r(m=-LJGp z1F8_tnIuEL=KC86*O>0C2j70W{dA+gp@0gENDs`<#Z86iK%98D4h zbb7suy`g01ta>ZzGcqXJwTWMwKL8ME|p~b^KyvL zl8`TY&yc=-(|_yyoUQ9cdEB52S&U=q&P!E=zkl|2u6Vbfj^V<%Xw_c$%EhL%U^jnG5PByPx^U3C}pA zPLeunhJfkGG~&OV5&XRWHY5H^8xj|Xz`r&mKFb+>U*jhE|o&E^WS%LKmcGl~`CB z^|{n32c5~7yEU;GGH{KlU-nYK!`wlGtHhB{)}B;`Tv0t|<9ph^j_Hrbx5vp#!}|#h zmBUn9`))lgBbXd7LEAl%`C*S->9GGzR{VeeGO1EVMn+!Of!5KMA(lEAy2uM+Y`}Q> zvpE7Z*LWRL5)${m<2U>h9Q&`{zMZQ99f!3gV2q@AgYQz|q+sufu!D#&w9XE#@&g5Q z(=ZUolNekb9i6bPQqUzyb8vMPSOxJ6UsKSCB1d>ZQK-%#^+ZU`)maHEz3!^1{8xe9 zbpe`=K&}B$S%4L+LBAxQeG*zO=E5o~E9)H23qh-{S1!NetP1w`*QiR*V{g$n8sco; zQ=uR#en>F2({VqO7yG7RnkU!xl>(HuL2x$?M}Nuj*eb|=LA2VL2g-2?TF?y%sum!jssIr_^j=H% zljGprZ}EVh!Sm`6W~LRElsJO?1Ywzsr}3LoP;sK|Z*6O{Co?94R)0`e>0Jf^P>@qi zqCQsxBuNOx!2tj&S9j}mlS|K&iGaD)?FMM|d8|(|_UJ?dd@U6kBjxX@198LRzdpWs0Mq!_$>MhJ+ zPR?C~9NjUF&)!pEhkHt9S|!&2bOX_1NeFyYqZ%X)cb}1ER1In(ZnK>{$xp4Ch>$mL zoPszBjj=L`!(tZwj2Hp<465g!zz+{G5BTsRE7{631pe;sF7$W13|m2WK>e#t>?hzT z!qB|}i3E*R-S$Oj5@zkI*L zb7|YrDQ{lChR5|-QL)^l`(y~riqPad*6&-|$ig2=N9Y?PRxZm@WyiDqI1AJ0+A@E8E7!E zIPMbL@6bRDLF*vjV@5M_V@6h^mWcqug%jz(B$|mqa$pwCdsbFy21L#z!JLD?LMMTq zws`A(omyuQF=n&L_GldG#=)VP(Lmrs8$R-eBN&C1v1^t*+Sghwg5r3b`AsgP5WdBY ztmfq71JiL;5YlEP{AS(`HOZVrojiOiuZj9l{T9TQ^6=YE!E1MhSiP^Cbj&jMeQa4njj@G z_awBU*!8TJmx|E(HTfxBFA6iTiy7xD_Yckia_DNFx^>3Ov-JFKzLRv>(aF;dxl|41 z4-_RP?n>EjfM{~GfQiG#3P$slP8ReC)9BJ5&@0%&ycT;aj=d5boTO`EDQNx97trX4 zZDa=FF2G8Tu9TjavDC?XXLhnD-C|qQY=9h5bjUk3kDG>7+r>!{kX1-=l0t)yh2hJz9)^&E>i zFlwxp7N;Sh$YpG02D9PE>3rUhmX=3W)%LTmoIEk^_{7A-z2WcQ8`Q`)j<^BlfeO$- zI%ZbQ(^Q9|vDd#ULn~8x%Jufpif1va?UY`wxVjuJ0m16ZN)!nU3|YPY`10I(DVTPk z7iB>Ggy(S->xG~89&h=GiIOpy?G@*x1UX0_F^YL0%KGc&*Vb zePFs8>kxcHij+WCD$$3N&B(Q13-NIbM}#Tp3T2ky)KH-zY=v*B?^*Nhsw3Wsm)Ny^ zV1lMta$buMyy}UZhCyjxU<@ggwsnWSB+va?$s(LjW_T zAxd`f8~)J%Xv(T2%AVDvX1VHfh|xRKb$`4DE%c_qeYIxiSw40OR($LyZ-f$V!q#aQ z^ARA|E-l?*FIR{*5{&^zp3y{FKJ0|Txq6rC$1|Ky;4?c96zn3C6j&xEDsUL_C4tM` zuBHnmh-)AYZ$2wAd!L$sc6{1ShDQ3btxVCmBQvV(B3$8uV|2Y>^e#6vGO`o&g%5Aj3A4g#+8}T}DRRN7J#?X3(68bvfusef zPSe?xPBfsqyKUz>$cB4X-&wYKU-CFN>j;0FBuR)1I*@a7bB}S*^AYDjwi;B-^78X{ zybnP0|6)1q*D8Sd6cq_bZ^EzF56r6|;orYsRIc1ij5rlWUfz?5AkN1|^!Md8#|?XY zE*#sU+}q2&0Pj>P{eAswUuOe<;0FpES2fV(ABzS7*?;>JCB!q3{u=xbZBBv%zOvjfd-Th7^$Y_Oyt3c%$U~Q9|#WK2Px)0SZL0Dl4Cxx%+)feZd%-f9UF4lk| zZfU?wQe11>JrH;62m_VUV&3HO3cHll&+SKpuuSM$2R>nKE8jqzg%yLCYcf#*oeWTq z*?M7A3hy=)I&>MX7H5sfsOUkFvN=B`j~_A6&G%?)TTSs>pTbErJTc+@tNT{p^BOPI 
zOmQ=y&$k70mf!oBx1vz=*`VasXmT?6IKJsL5xm}qTk3sS{fZT-K2I2&N*Ydu6X{ug>8bq0=GXygeH<>k9;eg`1 z!*>$E1%AZ3NnI^2D%x8Dc?R*m;qCi1+{qxZ9l=WB6x{_L97FqOYQc6IN-bM4a1*WZ zp#2-t8PbWpSsb5_D=RCY)(Et)xJ?c^x>srE4awRJvXqiD1<$Sld5&=r>{u*2Zo zx0J#TJcN(8i$*}5jy<`L;JKJAJv@>Bctb^bwd{7>BapJFllVXX$XzWyA;GfUq7IZv zikbNN%hS`pSI;=-)65mqM+{W@v>L%oDI_Z-!+yrpOO!=!JZ-*qm2<9nw9lh&h@-<* zpJU;W-6a5QTd=?Sk9$y*R;w~@C(){+O$P(ph6$Ky zsk6|lj?$T*pO0t-O~jLV2`k+iY;JQF=#Hl>;8XwcBK89Q*L|&3JpJc~`!Zbk8^>c` zxQa`a^3Naqzxgj!IRyM{r<2@47B0+i2r{0{%}v+ia{9v>>gwWRZX%)AU|V7HR-5gG z;^6S$B^b=L;0<1A-Lcsb)86C~_UL}SgH@gU^by2C!LFn=oDO#E8z6&1diGXEUfxQL zF6hCp>@_wwKMn@nE~H#v?eWomOiT=L^S40F1yoqT+5}iAiNKkIDG9W#+g`A;CpJdl z+2-tytLu63LI}aVL}62HaNf}r0!TQ~=Q!vU68=El#-ru7v2;;YNC@u2U7t@Ny#`hv z;tSd;DjN{P9)Q$E6bc;C;y}O~HZ+l&&ISh4($ro&_A@QsH6R!UyHKThA3Y)&ZgZCd z%c(cI-^VH<;t7X1x}A9H98A770MtF?Xa+|{$fuOV&guMsl@9D?6tw53)-{R{zJEOB z4hyobnET}$K)e~m2O+(Y@1|4EII9X-PIV8A&mRd8+K#g!AApHjc^6k4!k&T2UdVBT zgki*Ea3mB|3#6|&!q9T6)qd(R*CZ%4R*5XZDS@oMvQCI72Ui*wbxRQP_S3rx*K$lm zLSm7M70kUKw9fQvab-mpytCVXJ$nAAuDBSw<0~#9%Pfb#jRMr^J2?rF0uNsr+E>XW zmlw`)fmJC7bwo>N=K$&Bm8cINRF#x$p%_qIT@AUal(aNdqbB>wuTTn=Z2P>S5*?}L z;NwG=Ex@K^asX7K_RV211#oiOg2yr8`V+qo8{rAJeE2|g$fhF&1+{IEsF!Ppnx0kS z>sPNj(yrZ+YUx1_4q`};AoD2|Wm;94ho$a7k(5xCfHisniwQ&PR}mzY{bhHC!Jt6* zD-0{&PWa^=(A>Cj&%3%bJ(CwYIYIvoq{T8pvUK9g##et^vGJ)s?BneUfxdeK7#$4_ zje2sW2htcF62E?OPdsGcSwZsUM|b-N2A*nZv1kkhQ+rpK_1wljXhEp=Rl%&qYDYOc zgFKWSXd>+S1}OINz+2*oD?fee1pTlZU-K(p1Zkdm5K)Sz5j*;0sC6M4;_IEw%5@OX zs7z2&->Zlh{sp15!wy}gvaYVR5Ljd(QkI(JcxbE2=-rD_*4AvzeV}rba|v21;_L7RI6uhBHbOnU{>mDkoG4{(vvj` z`rVh33K|}VwjYbCbuk^732ULadv7klsw|TV|9i)5dbK(Hd9E9WKEA`TVqZK~#ss}z z!~LvX_>7$Z*}Nfe7$D#tf}pY!CpEgfqq|s)UIT@-*ehl1AUfTb*e7}jM?zXpSzH6q z=H#oEP;usQQuFvWb9o||nHn)ClC&f6i8sgG)hHlK|MW?m0*)*za=Mq>t2GEKE8aU$ z=Ia?*Ie9dOH*Q=dyAic&~x7^zOqFF(LK5le3|UWq=M)mIS-}7#1OsQ1h`T?v$EUa@e6tR zl2c+~XBRdC9@g;qQ6Wr?8x#iDQ#K%u8y8LgNI(P%I1eb~@vt3l(ly0YGT}#1t4|hm zXcklt7jA8BMG*$&lb(fCBd)Pg5)moB`kNz7F?l644h$8o#w~wq*&^@>ABEoF{MLvy zwDB9nuC`aNU(2^6oiK7{%-237WS)%z5hn326u6|y57FCw@G%`U&8-)rHc?P`>!f3; zqHj-zHGP3_GxyXdxf1|IXd{=AnVE|+q4a#`DwJjyPj1lh0Gcm_%i|jPZr^^OCAU3q z&zBCHPuwSvxiiq*{QU-wK&^SmYe!@E`ZeYw+d4Z-ozRONVBRGstD+KNz6c15|nApPu6f&*gRaH{ltvVR_McH>|}{Q&QxmG3Jl z4wQ>afQ5nL3Jg3X^J0e0mv#I`_Ga?)XP&qm=fi>di&JY*fIFDMMq5_)wRvV%7Lc@) zv4Ie(H%=lF6C?=O2Cc&wW?=>S;)`elLu(8B=(h(-q*3h8ZO zpdvLVjqyDFnA&x)GO1%SWyX;$;&C6Szm>uiPpv8`=^vi77mm|)Ly#D4=e-mP>tuASZZ7G5H23+I`XLhSwanr?Q!DGWV$OKSbFUz!GT#Qb zKBGjok6@>2T4aw>w`}piic|HOg4SGFZmlltqF0~r9B`CBeHQCMsEsDaYMp4A?)uaW zmskuiYu*2m!wZtO=t|SXsY3AkxEbX1`>snVn1SU9g>ByS!b7U%Jo{wV4;qGf0WAhG z?2EH?gxAi6Ct3{^+yWtRf1box3)r0YhVnr;aLv{ko)FuGXiTPJx{yuo`^3eK0xA@C z7`o^!@NtdHGy^Jtgqvn#Y;26ulmMS-^H@M2e1x~O5sZwC99nek4%U=dWP*60As*Od z#SY*No)lsp3&nnOJ7;i#s2Ki`kgy~@4>?b7NrJEo3SF6!(%{_Wc*|VyJ@o4%Ox-Gn zH>ZQCg?2`Vhk^B|Iy&y9_ID(N9OLd_dR5yba_cpfIGCB4JJKNeMx<0}HUQiu%Fd|2 z^7~x_chCEqt#E!U`lQHSe&l=W-eQpwwntMWN9^^qO81vbGVR%xiI6~<@wvsww5Bh~ zDeOYnw0GG*ckH8LDw0+?+u?qtKz{Z5tAzsn`qQQt-jtO7poxyVhpo;2xz>5Qj@vX> z9L%reR8k?N9kRd&VtbEU)WK-xyk>xc%1u(z`7sOYGsP<-IV8*LigKp6y7${0?Peg~ z*O~|Wxnd@6?!GV-()QD`xT~^`jujuPMp0wQuf7Y|P^`*^FhP&2#pBb)nIUvQ+4_P% z#7W&f-QC@$fm6o0^|L8&S1j;*dnfOxh%vIWug;(I;(7Y{PFKSsvpR&2dzY6$*3&X^ z3pbI#{cL!md5s}R4xzWNdI z^n4&69Kg`~AaA>g;>_j!mALUhcjY^FGcQ3-ds=@kF7C4gY`9m-!6wFN619cj`T0D$ z(#I*Y5xZ)Iz%LGE$;`&Kol-mzxDD<6LqecS)n(ELO0rgi-Nu9#Y1az8%pURWp|xyc z)t3QVYM9+RxyyGyrlVYwE%0G_nW(_bv9hfjM68vfr$m$e*}QQ3%PbZ`NioKwGA2tmb zbm^54m+turZH%NojfpKi2h;$iGBtqRj>}A zyR%eh9_1uh(Y`H)z>lLeH8s^5?Npf$pVe4B+sAa;67jVCRJ^m7$mAXalUw$d>=Eey zMrR@w6YuTLT<^2id7!-hzk}5^ZHess#Jmlr1m!~8qpOJT-Rf!rd1(;OG#yc58#Q( 
zoR4Yl^m}9Z3*5s!fFp0ZRORJa1G2FmpU)yNx~YNqwX(uj6?F}4Kjl)`LJA;|*7E7( z2j4@v*Ehr5+JL{&i`J;K*Q>Pu0$O%fk=ALyA^P)nK|j0!2=zxrBgi4+njr_!5QdOD zj$#HJVq_mbmc#Dmo@y~tl}963X(f|)*oK9^`dImWM{z(her4q64@QQ7su{LycAdja z)kMcf9D#3g&7fvfoSNsI6lQ9#JCJ+rlBALaKCDlu1*IKEs_Rzz9M_s-TN}AZ@D)v6a~qpvE@P|^tr^vickiA}J~kD+ zcrsWC2{#E;V&-ON?OtBJEz(a;r9tw|0Zk5-x7^wDoguu!e37T4g_n!bG~lI?Dj^@k zA^4fDZ-xtOkV5&PTIOGf?)i|NIAA$5b*YAFNo%zsic7rH1$!@M?HDW;#aT%PwUYX zgMtmujlRLhzwEIwO$$=j*X>#FEK=DU_c1oAM!@m!Zl~*W=)`>{y^8Erbr-7kXb+M5 zyd#cRc6En?fOm7(HB%OO5G$;C^F5N32`4iM{NaqjCVE{zpGkB+DJ{8|e6!kzG zU#O7~V`c5E+=AH1Wc(C^+o!5ZcQ^%m=6^0fu(zmrEysWlprgxhWoD*~gPeRfGYiW& zT`>rsmpMnNt^nTd=R$&&p(D`_ocXpQ1XH?RP5PaRbCkZwAi4HL_59!50X!Ooz*_Tj z=Frd(*ik`>3_1SBy-kn`l&mt(od%8H!Pe+dhzg!DIPn<01076@pk-`bnA9OFBV!08ui*JOy*ckfn+`hg-C*%X zE3sPa;Nh_ka(IyVx3{(Vk5+=WE0hd)dAC4@tZX({Co3oC;8XNvsxfo4=SebQ1SQs5 zHw8JllB2K*WP+b0C6u|&ojaFwrx5h|WJ2trr8F%q?W%|hY|s}IAjWoC9&rMFgi(o` z-Vrg}_wM-s4lwDt@#bI$T3gVutUlC5(ry7WR#{-q7EO)44fvX3iRNbpXW!VOZvg9- z_NZY?7=OR6LZt#uMR!fj5h(bm|nv-9(1Yel3&c%BM-_DLc zKYLPRICZ(7vN{OURZUeNjtFm6?AT1rdx#PVCTw?CK6LRP$`wep6TNqybuf7mb zkP;ECp>Fr9s;k$HA^@xbJPryo!60@YZ9W1KHq6_NAFtVAf7A0G1yc|(9p;7=2B`z^ zU+?dMJ#h_&7nA&w)&*L#u$nbwd_qDamw%xE7)z6nH@Wh&bWnE`km%sMPn>D1Cw%PPr(gh;D7|%fF6iFZ`g5K4i&&* z>j0{fem+@$u)8~aP#=;2(tcnnFqIkG0(9-(y59p2g4Vy4{z$>24g1o7{ z1<0`BRYM@fKS1ax3SkZXuW&}y(^>7&>>E;Wa*2F`QEC9=k^RcwVLk@|{G_*L>y8ie zkG!d{1$tP-eU_jgP`*(L6#0eu2~Z469R*F!eK%IH8@)UMuJjP`ng#>3Q6=&o4s&7Q z<5_ke*!o>gg6HB5=mLc!?(~wjhDPQr4qdE`w|8xO0|>i@wPR@5FRrOHrDI{#sboOu z5UF;+llY6m{rdF`gjG*aIIVdw&U2s+ai4xcs!b&!F78+Lk>%Dczj6xrJDE+QeryH! zU3|Qjy1F`8RC(-JJ0J?D7S0C}A#x3-zWSl6dd3FGp$NA5BD@Mbn3mx=Ai*pd$h%Og zgiM^RZX*qiB{e@i#(as^XIBho!zQ4N4Y1BHUAhFwguE;i- z8)vi1kqoF1;1Up-W;{{tR)&IB2#)w50%|sCEftj%y+2Tm*(hU}^;#So8ZxtPh;WE5 zu7!OFreLSMUGMGa5ocQ9)aUAc1?X&GCp;_!O8FoLZRh9t_jyBo8|r$4_l|ytK{k9I zasMJ^u8YRmCx|t8_JlKLC3gyyz!n`!`f6%u&^IlxoqnKYYChQZ0s{NS1Z=y7f&v>m zyN$WIAjJIOvb}ZcLUNjU5q2A1lY6dnk#tuSfQdheqCKTSKnoC}-jY{TN~ z68Ee1E$D#KoU};s5=1(127q6e<^`XPPJa0^Gb^xst6N9_%Q_cR8Cm&#pQ85kCUV0w(Q2DvShKnUD^djKG@ChTE>4pIXF8v1I_<~ye&wHzC2c=sk zRwi|k06HwZz#3!TjJ&_8x!L)M{#`@g*Sl)Y|-tyE))>stLtznH4!mUqZvKotH;`m4Vw!$)vvH1`?NL65_r>Us z<$WM{G*p-9p2uq@UhifY&*8QMJ{KiT<9%nJ12M^j@ledC-v3jyO@z>ntQDDB(zpZ$9|ykJMBMf#18m@e86TTegO_%J|D`>p?BUknrFt%nGFptu2! 
z2c7l)3sX6~X5Re9ZA5=zoS#0wn@CZ)^o5jgzLdo*b|?j2)$E7Wp0SV2j?`yT+Fko&Hvau)zm+s&H@urCsv61Sk?*XGm*e-Gu zkla}uEF0(q=!8}xOCZQ02P&OFN!bM9;Zb$)3@qNgOHd3SH~m}^F|9@p;lW5bX8Czg zE8OkTh**Jx#?+^{Riudqys|A_h-(4RYXv~>SnO}CL6eHjx{fu#dA@680=udIMe!h- zMIRB$#>hh*w#=9IYw4$BsM(JCH+)54b12YVFK$+bUM=sPaTlj6GQYtle7lmp3)p>q z$V-(m(0}Bo%|n2Y&`NfKYWmloYyJJ?5TQ?a?^|7e<_+UlU5v{MXx2e<9Mj>88A+Wy z3w&h8ZxBH`JOOdgTtHV!&;q8VWm#ECNlO?U2~=tih5>)HTG4Y*{Mb%*W=7xg65F4C zjlp6*{y)Pp^2pOokW=w`Bm(ZGK&qsplgq6kIM1>@r_yv%*?jOSh!6bZEanq!>RK~Qy~@L z$q@}*Yd$ZqQSgR?*!?7J?lym$BdfwhPW)GfPBJpqtlxz)4`UzB3HYjxL?w`KAs~r8}v4X<(e}?ivG*Py=wL$K&SneQAx+iR|i7UNo*4+$G zbiy;9dbnrsnwjMiU#KiQV1+R_+oPg{Fhd~v7~C`n$Zxf4Ag+Rz>j06AcLY!j6Q!PU z@&0o^*jq?P<(j-djmhh@=wZ7qmH6WRDobC}dtf12CPFc)GYzO~`#l<@Y@o7|k-Gsg zO=lW}iSZ;}d-uiY#=Ih?4#Ml(6^Z|%_>_>QVYnZx6``lrm8D%u&zmo-zGtG=hG&{g zy4j=p@~sF#cDD8|k>dh)0l-LQDEC0r1mu2!Bh5sa*?yzyWpZW*v&qZ^Q4 zdVUpqUlJctrE3-lgy>k&^FPW6uqx#U4{s!W!i@rNc!-j7=!4P?-3o;HG#-vSv zQ&o%Xeo9Tq2XX-@nKv<-ZE2o@CJ&^oq#~hNiI~@v7_{l?kyS87r>nuHG4}#UN_KPB z&=6McT#H3m%F|}-3na0QG!P!B9F3-i%bvs&^`0#FG(8OL7!gN=jm_Z5t4Wp=xXG_` zFY|c8-5rDX94b}S@wmo-?fw9rcSm&ufV+dvpvN>j#Wo62hM99hY^=Ezc%GHDuIH%W zED)V43zfSvRD$`ri$1Heu7-&|fV#g^Y}S6iThHf*Y2(Pi&ofQy$~12PY%gkW5VoE2 zE!By(FPIw<^ZJias1loC~1tz$(Q; zlU$siHaPuYnpai-{8o7fkXmFytiuajb|@#*^IiWSycd|?&MOS7Y$})miRJ+B)9ciz z7|x#s0l^ide-7O;c)^xuvg2Sihxgfa1MXpE@I!=iWEd=34`B~~OqKo*Fb^gMfO(rM z>^fHs0AnaSinB3+x+aMr)y9WgFfQ#!&b}cgfX{6p<_2o?EN)} zBgla&V6!|1v)lR|1&97uSeWvLPoG8_)ut)u)lL5MAqxQ{=}7kj7M2*4YR_IKkJK2{ z>(&g=P%qrf;6kA?nCDt&k2CxL8S1d{`QhJNCpwFvsMA@QH{8+LdXS{dz3wjEGEGS6T2tZYfm)kb(M$YHY-_U}>!lM$5bfnrI z)*~e{**AQ4+x0fR2{x1hCJAyOtM1xMdhBXKW&5r`je>4$K+OILz>j}-=_VCzgsofr z|3TY($79|9f5VD2l{AE+LM0(tWhOF`RElg3BZpD;2uYO6p4m!98D)=*woS;)%HFcK z`}xs%e#dp)-}`qxuKV}%$9Z--RmX9B-sAOpt{2sq^!G+D)HmvuRYF|_3nN0kb6fw3U-3wl245^1@r77J*?96Y_qmy*G^ZY{aJ#cxgkE%86W}lmoJd z`mFB3Q!n)fjot?j);1q@4NXa<==}C~`R#{U(ZNNkmkt`n|>lRYG5r z>O7=_J_?cvr_ivWFY`rru=bQkTZ9l&8^iI-KlM(MH-YdJi1d>r$SZdi`+#O)DgOz) z3RNfCW%9Bw3D}qQ7#WQ=VUV6`MCL*s`fT(>vaYRqy;xlcPLq3~tg1@?vjPVS&Oyw| zvC`BGH?koGLbN)cta&19SbUjmZWX;Q@|i8Uw>{5dP-$?bFR_$PruHtF#|Ve{Vlx&? zsJ%B{0xmr;FmPtKNwcHIH$=KqNieGS1)kTV8GRjH%@lp52_{X z47`pXi6^+$IH9XaOPjFIWw)fC?)rGJA+B$zFBXM7FJ8Xv3Bm>i?_Uj+Fs)k38madUyl`!U6PACld_SDQ_eP2VXyX>lP8C!5T4j zhXpeat$V|)FsSR`Qn!og-7b2VzpI*hAqBJ7bcgm?O!R73&hyMNXPnW$OJVeRa03wV zp+bH1+jov^T&EKXS)ZFFkx)WQtE{EPq+RujB?Xojb?DWep`_EldX>FZl{?x-`q&FRcV;*aKsz?$T8S(<( zJub;_*w8*Cj1hlWje6Ugu6Amv_y(X+M!^8$KeU z>09hMJ3G4w=Ga%SR^P!|Wt{zMz5*IqOGgYTtcYaAh+2TUUXgN804}2z&zPn=Wy%sy zG{0xsAVfjl+yAB{vsFG{b@|a1pg1J0)c1XVS-u&=mm)e|R97_NWr*_z%b7xb|_?oEh5NW)GN1j$NUcz1{e5K;c7E_#<&9go?XlZFN zQy#EIlRJtoLp$SidjJE7NFPE53Y1WzNb*s)ZY+8kM zTyg1S{m`4_WC1Qe3PlEs%gntL7kjvEc>I0k;8NkuOHZN!GuL*ztjq#y10-xYJ*JoIO!qy&XZwL$4o0<@eYh(h&KkT9dV03l`A@`IWeWE zgA2Xqt{WMx5rAn=H7K&uOCo2i_n>vOl9I=a`3&QXS&l4tTD~gNXB|SCzcp=bf(e|`C&DOB zV8%5-vlw)(l7UZ6u}I)@bAYuAMO~6=dP&;DM~_HxRV^)vW=AY$#_e>&VaC1Y8~{*h z{(j*f9ne z`1nTr`_l6A!b%@pyqU;Xp#OFAlsaToritU-o$Ibmc+oC#J zR(BA<0n=1cwBg_2nU$HyoGJ-cq=&jAI27yT(HPZ<+00Q`!&hbQ5Gz{HTVU4f!4^zx zW31%w-&i3p51T`;wSf)dVxE3LWI4>v*bk|O(JC&|%mM_CL6M@CypnyP>| zr>=VhF=QF-km>FUc|0#ZXUv-B$D%=ypQLzrZeZ`<+VPU3!`$NbLkZCX3gC9 z=~9*k{N>MizI`GrP3?`fEiLv?S?Aq#F(p6q1Scydoy7I4Vd`~i9r!VlD#nZb*|VsU z%an@8L)9>}R7d{o)Yu{U;I5&)Oiam#e~rt$Pv+~}LFL1c6Y4En&x@Mog;5PS!tz5_ z+olGWoOGKy_B*^rj%{FWCoC}!Xx2}NS^d{KH{X-rByXTjd%fb$$-RGj;9}h1EkaB+ z@pk^0E1lBG%6S0tx{BcG+QsXcnINRvYyRBo^FBA#F!GWw#5mOR#DKPwLi*7JI##=? 
zr6|<#dVk}~CgLW6Tm(Y07`AE@`-bVR0I&Om<(7wWr-{e1dElZj=F40W&7v`7y6g6+*i{xWKKq~^ zTHIarRaM%{-V3&Bg`P)rR-3U#Moz9XnP>=J zqyGK5t#hV6wXhR+eN}5~(ngQgQ!(k3_I{^}N|zc4^`4SIXr;V4^gr^rp_}wasCD)w5OY*Teg7(pT-N2pepBhaTgnS$Bv@ zJ~wk0t@=R_e<@g zUntA}$jk)1y;(jo)@5w%Qar17d+WAsnf`|8af*Dbr4CQj>+w&K44T6j=Mxu@sB6?I zBakY4X6?VBUfitrVI@+DxRMB4fXeyb%zY`tMX{<9?eU=jPlPE7Is3^AI0MJ<;tLg% zBO|q8t*gtHN>kzdH?>vB06=*8D_NUm$v9}q5GA}Le*z13!n_YMC2q;rINj-nRBI&iK zYV_^i*Lp?{c9^8qMRIrw`4ieS(Qg<9{{Ny)%ewLpZQ9x}dyO+~Dyh2esH7C{0}fFw zD5hAx167~rvCSQ)NQ6G6&eU1=bMU^9b55m1X-5qqQ*{&)*dw3EEQ+666{-0>{BG{$ z_zd{~CjDGoTvl6?_J!WMH=F_SRAtR}A(Q87aQ7wsI&72ld`n5or*R+Efm5(buA5P0 zMMy#Ek;QOwvN4Kc$X%0T_>u^4+P!YvDZMW8Lr}hDRiroox(ttm!F7*OZDy&zZ`dXt z&j3^1_Iim&c*s`Bh6zGOMlHcps zA2_QgA&AVq(yjAFr&U#h%?6=ol76Ib!vSZyh!Qo2U%1yMA}Ut>({qu>5DjOyCvfh*J=|FN%^^R zKKe&7IE0z`C8$1BkBbpYz6zq>ja5mrP~=*Sv)+UcZ9d4^M;v?jjXi{{4@}dbf#D&q z`|)Gli3rMzu9xXgq-u8bhGAQp%`oVsWSn$~KcV(ay3#4fP5 zuYX6{ zbO6UV*w;c&ivZ_72#uhX^jTp_Ov*Ki-{5MeCN)c z)*hcB=Cney2>*e#L&CUcH7vI1kns$m=GWuz^ARawU@=jIJ*Qw1FLU)v3-xMYV_YgT z#3qV=Lr|;#XFDC;9L_DY-C<&m`Pgpo9)5hdF(&4?)rdMt;h#FJLipFewInR>ICTCk zht%mm$p^$AaZ*QD*SO*Bb-W~4n73@(#sp!X-Dn<~H4I^GY7&gC2d0;8y!zMuH~oWp z(?6`^g}%=JT_Upo3zb@-Z(Mfv+}$lO{?;`%_UrU|cx>$-t(n!_%Bk18bbSx_+5G&^ zA#XYcHv(}IR>ASeNUCDp&MZ!ghs^Vf?F}C;A;QTCm~NzCd5Wu5{h=RlNGm(yw$+FT z3fj}}6&ToiLcSULNwA-q_kMpt(96tVJ=d{ghQRkg%4=eJAcT+63YRZe=W4PaIr6sW z|3F}t@*Q9o>;sUMXlVNQ!)ntE%%@utvq#tQB