Research News
ChatGPT’s responses to people’s healthcare-related queries are nearly indistinguishable from those provided by humans, new study reveals
ChatGPT’s responses to people’s healthcare-related queries are nearly indistinguishable from those provided by humans, a new study from NYU Tandon School of Engineering and Grossman School of Medicine reveals, suggesting the potential for chatbots to be effective allies to healthcare providers’ communications with patients.
An NYU research team presented 392 people aged 18 and above with ten patient questions and responses, with half of the responses generated by a human healthcare provider and the other half by ChatGPT.
Participants were asked to identify the source of each response and rate their trust in the ChatGPT responses using a 5-point scale from completely untrustworthy to completely trustworthy.
The study found people have limited ability to distinguish between chatbot and human-generated responses. On average, participants correctly identified chatbot responses 65.5% of the time and provider responses 65.1% of the time, with ranges of 49.0% to 85.7% for different questions. Results remained consistent no matter the demographic categories of the respondents.
The study found participants mildly trust chatbots’ responses overall (3.4 average score), with lower trust when the health-related complexity of the task in question was higher. Logistical questions (e.g. scheduling appointments, insurance questions) had the highest trust rating (3.94 average score), followed by preventative care (e.g. vaccines, cancer screenings, 3.52 average score). Diagnostic and treatment advice had the lowest trust ratings (scores 2.90 and 2.89, respectively).
According to the researchers, the study highlights the possibility that chatbots can assist in patient-provider communication particularly related to administrative tasks and common chronic disease management. Further research is needed, however, around chatbots' taking on more clinical roles. Providers should remain cautious and exercise critical judgment when curating chatbot-generated advice due to the limitations and potential biases of AI models.
The study, "Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study," is published in JMIR Medical Education. The research team consists of NYU Tandon Professor Oded Nov, NYU Grossman medical student Nina Singh and Grossman Professor Devin M. Mann.
New mathematical model optimizes modular vehicle fleet routes
Researchers at NYU Tandon School of Engineering’s C2SMART Center have developed an algorithm to plan the most efficient routes for modular vehicle (MV) fleets — specially-designed vehicles that attach and detach from one another as they move people around cities — removing a significant obstacle to making this type of transportation system a reality.
In a paper published in Transportation Research Part C: Emerging Technologies, the researchers employ a mathematical model called MILP (Mixed Integer Linear Programming) to optimize the service time for the passengers and the travel cost for the vehicles in an MV system. The model factors in passenger pickups and deliveries, en-route transfers, and variable capacity of the MVs to identify the best routes and schedules for the attachments and separations of the vehicles.
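To make the trade-off concrete, here is a minimal sketch of the kind of decision such a model weighs. It is not the researchers' formulation. Every name and number below is hypothetical, and instead of calling a MILP solver it simply enumerates the binary choices of a toy instance (which vehicle serves which passenger, and whether two vehicles attach on a shared leg) to find the cheapest plan under a capacity limit.

```python
from itertools import product

# Hypothetical toy instance; none of these numbers come from the paper.
PASSENGERS = ["p1", "p2", "p3"]
VEHICLES = ["mv_a", "mv_b"]
CAPACITY = 2          # seats per modular vehicle
TRUNK_COST = 10       # cost for one vehicle to drive the shared trunk leg alone
PLATOON_SAVING = 4    # total cost saved when the vehicles travel attached
JOIN_DELAY = 1        # extra minutes per passenger spent waiting for the attachment

# Pickup detour in minutes if a given vehicle serves a given passenger.
DETOUR = {
    ("p1", "mv_a"): 2, ("p1", "mv_b"): 6,
    ("p2", "mv_a"): 5, ("p2", "mv_b"): 1,
    ("p3", "mv_a"): 3, ("p3", "mv_b"): 4,
}


def solve(time_weight=1.0, cost_weight=1.0):
    """Enumerate every binary decision and return the cheapest feasible plan."""
    best = None
    for choice in product(VEHICLES, repeat=len(PASSENGERS)):
        loads = [choice.count(v) for v in VEHICLES]
        if max(loads) > CAPACITY:
            continue  # capacity constraint violated
        for platoon in (False, True):
            if platoon and min(loads) == 0:
                continue  # an unused vehicle stays parked, so nothing to attach
            used = sum(1 for load in loads if load > 0)
            service_time = sum(DETOUR[p, v] for p, v in zip(PASSENGERS, choice))
            travel_cost = used * TRUNK_COST
            if platoon:
                service_time += JOIN_DELAY * len(PASSENGERS)
                travel_cost -= PLATOON_SAVING
            # Linear objective: weighted passenger time plus weighted vehicle cost.
            objective = time_weight * service_time + cost_weight * travel_cost
            if best is None or objective < best["objective"]:
                best = {
                    "objective": objective,
                    "platoon": platoon,
                    "assignment": dict(zip(PASSENGERS, choice)),
                }
    return best


default_plan = solve()
time_sensitive_plan = solve(time_weight=2.0)
```

With the default weights, the sketch attaches the two vehicles because the cost saving outweighs the attachment delay. Doubling `time_weight` reverses that choice. Enumeration only works for toy sizes, which is why realistic instances need a dedicated optimization model.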
Conventional mass transit and demand-responsive transportation systems can face challenges accommodating fluctuations in traveler demand, leading to long travel times, energy inefficiencies, traffic congestion and financial waste.
Low-capacity vehicles like vans may be slow and overcrowded in peak times. High-capacity vehicles like buses may be largely unoccupied when demand is low. On-demand services like paratransit often deliver only one passenger at a time, making them expensive to operate.
MVs offer a flexible and efficient alternative. The independent vehicles in MV fleets can connect while in motion, creating platoons that travel as one unit until the vehicles detach. According to research lead Joseph Chow, Institute Associate Professor in the Department of Civil & Urban Engineering and the Deputy Director of C2SMART, MVs can move people faster, with lower energy consumption and operational expenses, than many conventional systems.
“MVs offer a promising alternative to move people more efficiently in certain situations,” said Chow, who collaborated on the research with NYU Tandon Ph.D. student Zhexi Fu. “Imagine, for instance, employees at the same company. The individual vehicles could pick up people who live within similar enclaves, and join together in a platoon to deliver the entire group to its workplace. MVs also have significant potential to improve on-demand transportation that delivers people door-to-door, including those that serve people with disabilities.”
Currently, no city has an MV system in use, although Next Transportation Systems is piloting an MV test in Dubai. According to Chow, the inability to track and route MV fleets has been a significant roadblock to potential real-world adoption. To build its routing model, the C2SMART team used the Anaheim network, a traffic simulation of Anaheim, California.
The research on MV routing is the latest in a long series of studies Chow has conducted around urban mobility. His previous studies include examinations of Dial-a-Ride programs, e-scooter usage, and urban bus networks. Chow’s new research also advances the mission of C2SMART, a U.S. Department of Transportation (US DOT) Tier 1 University Transportation Center (UTC) designated to address the US DOT priority area of Congestion Reduction.
Fostering innovation by connecting engineering and medical students
A new paper from researchers at NYU Tandon School of Engineering and the NYU Grossman School of Medicine explores how interdisciplinary programs connecting medical and engineering education may foster innovation and prepare students in both disciplines for more successful careers.
The paper, published in the Technology and Innovation journal of the National Academy of Inventors, describes initiatives at NYU as a case study, along with similar programs at Johns Hopkins University, Stanford University, Harvard and Massachusetts Institute of Technology.
NYU Tandon and Grossman have partnered on educational programs for about a decade. But for most of this time, the skill-sharing only went in one direction, explained lead author John-Ross Rizzo, a rehabilitation medicine specialist and professor at both schools. “It dawned on us that we spend a ton of time bringing engineers to the medical school, but almost zero time trying to get our doctors immersed in the engineering world,” Rizzo said.
By bringing medical students to the engineering field, as NYU has done in recent years, educators can enable a shared understanding of engineering concepts that contributes to more effective problem-solving, Rizzo and his colleagues argue. Clinicians and engineers are more capable of collaboration if they speak each other’s languages, and the innovations that result from these partnerships may be better set up for long-term success.
Rizzo compared this interdisciplinary learning to earning belts in martial arts. A medical student might not become a “black belt in computer science,” but might learn enough for a “yellow belt” — a lower level of understanding, but enough to enable collaboration with the true experts. “We’re creating a smarter generation of students,” Rizzo said.
One way NYU students may gain this expertise is through participation in the NYU HealthTech Transformer Challenge, which pairs engineers and clinicians to work on “healthcare’s most pressing problems.” Finalists from the program have won funding from NYU and other sources to pursue their ideas at new startups. Other challenges and grant-funded research projects at NYU Tandon and Grossman have allowed graduate students to receive co-advising from engineering and medical professors.
The researchers also discussed barriers to setting up these interdisciplinary programs. Early initiatives may require extensive effort, including dedicated advocacy to bring different school administrators on board. It may be especially tough to convince medical school leaders to devote student time to engineering work outside their typical course load. Part of the challenge is a lack of data: while NYU and similar programs have produced some clear success stories among individual students and startups, universities are not tracking their results in a comprehensive manner.
In the new paper, Rizzo and colleagues share lessons from NYU’s leadership in this interdisciplinary space and from programs at other institutions. The findings may provide inspiration for more universities to consider connecting medicine and engineering education. “I think this is a trend we’ll hear more about over the next decade,” Rizzo said.
Better transparency: Introducing contextual transparency for automated decision systems
LinkedIn Recruiter, a search tool used by professional job recruiters to find candidates for open positions, would function better if recruiters knew exactly how LinkedIn generates its search query responses, which would be possible through a framework called “contextual transparency.”
That is what a team of researchers led by NYU Tandon’s Mona Sloane, a Senior Research Scientist at the NYU Center for Responsible AI and a Research Assistant Professor in the Technology, Culture and Society Department, advance in a provocative new study published in Nature Machine Intelligence.
The study is a collaboration with Julia Stoyanovich, Institute Associate Professor of Computer Science and Engineering, Associate Professor of Data Science, and Director of the Center for Responsible AI at New York University, as well as Ian René Solano-Kamaiko, Ph.D. student at Cornell Tech; Aritra Dasgupta, Assistant Professor of Data Science at New Jersey Institute of Technology; and Jun Yuan, Ph.D. Candidate at New Jersey Institute of Technology.
It introduces the concept of contextual transparency, essentially a “nutritional label” that would accompany results delivered by any Automated Decision System (ADS), a computer system or machine that uses algorithms, data, and rules to make decisions without human intervention. The label would lay bare the explicit and hidden criteria — the ingredients and the recipe — within the algorithms or other technological processes the ADS uses in specific situations.
LinkedIn Recruiter is a real-world ADS example — it “decides” which candidates best fit the criteria the recruiter wants — but different professions use ADS tools in different ways. The researchers propose a flexible model of building contextual transparency — the nutritional label — so it is highly specific to the context. To do this, they recommend three “contextual transparency principles” (CTP) as the basis for building contextual transparency, each of which relies on an approach related to an academic discipline.
- CTP 1: Social Science for Stakeholder Specificity: This aims to identify the professionals who rely on a particular ADS system, how exactly they use it, and what information they need to know about the system to do their jobs better. This can be accomplished through surveys or interviews.
- CTP 2: Engineering for ADS Specificity: This aims to understand the technical context of the ADS used by the relevant stakeholders. Different types of ADS operate with different assumptions, mechanisms and technical constraints. This principle requires an understanding of both the input, the data being used in decision-making, and the output, how the decision is being delivered back.
- CTP 3: Design for Transparency- and Outcome-Specificity: This aims to understand the link between process transparency and the specific outcomes the ADS system would ideally deliver. In recruiting, for example, the outcome could be a more diverse pool of candidates facilitated by an explainable ranking model.
Researchers looked at how contextual transparency would work with LinkedIn Recruiter, in which recruiters use Boolean searches — AND, OR, NOT written queries — to receive ranked results. Researchers found that recruiters do not blindly trust ADS-derived rankings and typically double-check ranking outputs for accuracy, oftentimes going back and tweaking keywords. Recruiters told researchers that the lack of ADS transparency challenges efforts to recruit for diversity.
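For readers unfamiliar with Boolean search strings, the sketch below shows how an AND / OR / NOT query can be evaluated against a candidate profile. It is purely illustrative: the function names, the example profile and the substring-matching rule are assumptions, and nothing here reflects how LinkedIn Recruiter actually parses or ranks results.

```python
import re


def tokenize(query):
    """Split a query into parentheses, quoted phrases and bare words."""
    return re.findall(r'\(|\)|"[^"]+"|[^\s()]+', query)


def matches(query, profile_text):
    """Return True if profile_text satisfies the Boolean query.

    Precedence is NOT, then AND, then OR. Operators must be upper case.
    Terms match as case-insensitive substrings, a deliberate simplification.
    """
    tokens = tokenize(query)
    text = profile_text.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            advance()
            rhs = parse_and()  # always parse the right side to consume its tokens
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "NOT":
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        if peek() is None:
            raise ValueError("query ended unexpectedly")
        token = advance()
        if token == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return value
        return token.strip('"').lower() in text

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


# Hypothetical profile text used only for illustration.
profile = "Senior data engineer with Python and SQL experience"
is_match = matches('python AND (sql OR spark)', profile)
```

A matcher like this returns only yes or no. The ranking that decides the order of matching candidates is the part recruiters in the study could not see.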
To address the transparency needs of recruiters, researchers suggest that the nutritional label of contextual transparency include passive and active factors. Passive factors comprise information that is relevant to the general functioning of the ADS and the professional practice of recruiting in general, while active factors comprise information that is specific to the Boolean search string and therefore changes.
The nutritional label would be inserted into the typical workflow of LinkedIn Recruiter users, providing them information that would allow them to both assess the degree to which the ranked results satisfy the intent of their original search, and to refine the Boolean search string accordingly to generate better results.
To evaluate whether this ADS transparency intervention achieves the change that can reasonably be expected, researchers suggest using stakeholder interviews about potential change in use and perception of ADS alongside participant diaries documenting professional practice and A/B testing (if possible).
Contextual transparency is an approach that can be used for AI transparency requirements that are mandated in new and forthcoming AI regulation in the US and Europe, such as the NYC Local Law 144 of 2021 or the EU AI Act.
Hybrid Decoders for Marked Point Process Observations and External Influences
Wearable monitoring is likely to play a key role in the future of healthcare. In many cases, wearable devices may monitor physiological signals that can indicate mental states, such as emotions. The lab of Rose Faghih has been developing a system called MINDWATCH, a set of algorithms and methods for wearable sensors that collect information from electrical signals in the skin to make inferences about mental activity. While the lab has been successful in translating these physiological signals quickly and effectively, it had not incorporated direct feedback about the individual’s subjective experiences.
Now, the researchers are incorporating feedback and labels from the users, enhanced with machine-learning, and combining it with their existing model for a fuller, more accurate picture of mental states.
At a high level, the human body can be viewed as a complex dynamical system. It is a conglomeration of control systems that each work in turn to manage different variables or states. Unfortunately, a number of the states researchers are interested in, particularly the more abstract ones, cannot be measured directly. These include states of emotion, cognition and consciousness. Nevertheless, changes in these unobserved states do give rise to corresponding changes in different physiological signals that can be measured more easily. For instance, we may not be able to directly observe or measure a person’s emotional state, but we can measure subtle changes in a person’s heart rate, breathing or sweat secretions (which in turn affect the conductivity of the skin). These signals can then be used to estimate the states of interest.
However, the signals researchers use to estimate these unobserved states are “spikey” or “pulsatile” in nature. These spikey signals can be used to estimate the various states of the human body and brain without direct observation. With existing methods, one can obtain state estimates but has no means of making those estimates agree with more direct state-related information that may be available. This is especially important in experiments involving human subjects, where the subjects can provide information related to the unobserved state. This type of more direct state-related information can be called a “label.”
Incorporating direct feedback from users offers information that can’t be gleaned from biological data alone. For instance, a person with PTSD could have their skin conductance continuously monitored to provide an emotion estimate, but ideally, the final estimate should rely both on these signals and information perhaps obtained on rating scales or through regular questionnaires. It is likewise the case for patients with hormone disorders. Hormone measurements do provide valuable information, but should likely be combined with personal feedback (e.g. regarding feelings of energy/lethargy) to obtain a single complete picture. The authors met this need through a proof-of-principle work on a hybrid type of estimator.
Performing estimation on some data such that what is predicted agrees with available labels falls within the domain of supervised machine learning. This work adapted an existing neural network method for state estimation by adding a penalization term for not agreeing with the labels to enable a hybrid estimator. The proposed hybrid estimator was utilized to determine an aspect of emotion tied to changes in skin conductance (through changes in sweat secretions) and to determine energy states within the body based on pulsatile hormone secretions. A wearable monitoring system that incorporates verbal feedback from the user with physiological signals for hybrid estimation can eventually provide a more complete picture of the user and, in turn, more comprehensive closed-loop care.
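The idea of a penalization term can be illustrated with a much simpler stand-in than the method described above. The sketch below is not the authors' estimator: it uses no neural network and no point-process model. It fits a smooth state sequence to a made-up signal by gradient descent and adds an assumed quadratic penalty whenever the estimate disagrees with a user-supplied label. All names, weights and data are hypothetical.

```python
def estimate_state(observations, labels, label_weight,
                   smooth_weight=1.0, step=0.02, iterations=2000):
    """Minimize signal misfit + smoothness + label_weight * label mismatch.

    labels maps a time index to the value the user reported at that time.
    The step size is chosen to be stable for the weights used in this example.
    """
    n = len(observations)
    x = list(observations)  # start from the raw signal
    for _ in range(iterations):
        # Gradient of the misfit between the estimate and the measured signal.
        grad = [2.0 * (x[t] - observations[t]) for t in range(n)]
        # Gradient of the smoothness term on neighbouring time steps.
        for t in range(1, n):
            diff = x[t] - x[t - 1]
            grad[t] += 2.0 * smooth_weight * diff
            grad[t - 1] -= 2.0 * smooth_weight * diff
        # Penalty for disagreeing with the labels; this is the hybrid part.
        for t, label in labels.items():
            grad[t] += 2.0 * label_weight * (x[t] - label)
        x = [x[t] - step * grad[t] for t in range(n)]
    return x


signal = [0.0, 0.2, 0.1, 0.3, 0.2, 0.4]  # made-up physiological feature
self_reports = {2: 1.0, 4: 1.0}          # made-up user labels at time steps 2 and 4

signal_only = estimate_state(signal, self_reports, label_weight=0.0)
hybrid = estimate_state(signal, self_reports, label_weight=5.0)
```

Setting `label_weight` to zero ignores the self-reports entirely, while a larger value pulls the estimate toward them at the labeled time steps. How strongly to trust labels relative to the signal is a design choice this sketch leaves open.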
A wearable dataset for predicting in-class exam performance
Stress has a negative impact on physical health, reduces work productivity, and results in significant annual costs for industries and healthcare. While high stress is known to raise the risk of cardiovascular disease and harm mental health, both excessively high and excessively low stress also impair a person's ability to complete tasks. There has been growing research interest in understanding how real-world stress affects our bodies and performance, at work and across life activities.
Unfortunately, attempts to simulate its impact in the laboratory or elsewhere are less useful than datasets gathered in real-world circumstances. At the same time, researchers have access to few real-world stress datasets. Even rarer are such datasets used in longitudinal investigations on the same subjects over time.
Real-world situations are also unrestricted environments. Research-grade equipment is frequently inaccessible, and motion artifact contamination is pervasive. These continue to be some of the biggest barriers to automated emotion decoders outside of the research labs in daily life.
To address the above-mentioned gap, Rose Faghih and her former PhD students Md. Rafiul Amin and Dilranjan Wickramasuriya performed an experiment in which a set of students' physiological data was gathered over the course of three exams. They used a smartwatch-like wearable device to collect multimodal physiological data, with the aim of providing a seamless data collection experience for the participating students.
The investigation shows that it is possible to link the variations in the physiological signals to the exam performance. More details about this study can be found in the corresponding publication titled "A Wearable Exam Stress Dataset for Predicting Grades using Physiological Signals."
To enable other researchers to use this dataset for additional investigations, the research team has made the de-identified data publicly available on the PhysioNet platform. A Wearable Exam Stress Dataset for Predicting Cognitive Performance in Real-World Settings is available at: physionet.org/content/wearable-exam-stress
Ultimately, the researchers believe it would be extremely beneficial to consider how exam performance and the stress that goes along with it interact. It will allow for a wide range of potential applications with the aim of enhancing personal performance. This may, for instance, assist scientists in developing effective interventions to improve each person's performance and increase productivity within a company. Additionally, the knowledge may be used in online and remote learning contexts to connect with students effectively and improve learning outcomes.
EyeScore: Predicting stroke reoccurrence through retinal scans
One in six deaths from cardiovascular disease is due to stroke. Caused by a blood clot in the brain, a stroke can have severe consequences, even minutes after initially occurring. That is what makes prevention so important.
Now, thanks to NYU researchers, including a collaborative effort between NYU Assistant Professor S. Farokh Atashzar and NYUAD Assistant Professor Farah Shamout, early monitoring may soon be available at an unlikely location: your eye doctor.
Their project, called EyeScore, is developing a technology that uses non-invasive scans of the retina to predict the recurrence of stroke in patients. They use optical coherence tomography — a scan of the back of the retina — and track changes over time. The retina, attached directly to the brain through the optic nerve, can be used as an indicator for changes in the brain itself.
Atashzar and Shamout are currently formulating their hybrid AI model, pinpointing the exact changes that can predict a stroke and recurrence of strokes. The resulting model will be able to analyze these images and flag potentially troublesome developments. And since the scans are already in use in optometrist offices, this life-saving technology could be in the hands of medical professionals sooner than expected.
Conspiracy Brokers: Understanding the Monetization of YouTube Conspiracy Theories
In a first-of-its-kind study, Center for Cybersecurity researchers led by Damon McCoy have found that YouTube channels with conspiracy content are fertile ground for predatory advertisers — with conspiracy channels having nearly 11 times the prevalence of likely predatory or deceptive ads when compared to mainstream YouTube channels and being twice as likely to feature non-advertising ways to monetize content, such as donation links for Patreon, GoFundMe and PayPal.
Researchers also discovered that:
- Certain scams were more common. Self-improvement ads, many of them get-rich-quick schemes, were seen more frequently than in mainstream content. So were lifestyle, health and insurance ads, including two advertisers unique to conspiracy channels that were generating leads for insurance scammers. Ads promoting questionable products were also common, such as a supplement that claimed to cure Type 2 diabetes.
- Affiliate marketing was a constant. Among those marketing low-quality products, for example, almost 95 percent used some form of affiliate marketing.
- Videos with ads got far more views. In the conspiracy channels, monetized videos had almost four times as many views as demonetized ones. Since YouTube’s business model relies on advertising, this may be because its recommender algorithm prioritizes videos that contain ads.
- Content pointed to alternative social media sites. Sites like Gab, Parler and Telegram were mentioned more commonly in conspiracy channels than in mainstream ones; Facebook and Twitter were also frequently referenced.
The study was conducted with support from the National Science Foundation.
Comprehensive study reviews best ways to monitor defects in additive manufacturing
Additive Manufacturing (AM) — commonly known as 3D printing — involves manufacturing processes that depend on a user-defined set of optimized parameters. Monitoring and control of these processes in real-time can help achieve operational stability and repeatability to produce high-quality parts. By applying in-situ monitoring methods to AM procedures, defects in the printed parts can be detected.
In a new review in the Elsevier journal Materials & Design, Nikhil Gupta, professor of mechanical and aerospace engineering and director of the Composite Materials and Mechanics Laboratory at NYU Tandon, and Youssef AbouelNour, a doctoral student under Gupta’s guidance, examine the application of both imaging and acoustic methods for the detection of sub-surface and internal defects.
The imaging methods consist of visual and thermal monitoring techniques, such as optical cameras, infrared (IR) cameras, and X-ray imaging. The data is abundant as numerous studies have been conducted proving the reliability of imaging methods in monitoring the printing process and build area, as well as detecting defects.
Acoustic methods rely on acoustic sensing technologies and signal processing methods to acquire and analyze acoustic signals, respectively. Raw acoustic emission signals can correlate to particular defect mechanisms using methods of feature extraction. In their review, Gupta and AbouelNour discuss processing, representation and analysis of the acquired in-situ data from both imaging and acoustic methods. They also introduce ex-situ testing techniques as methods for verification of results gained from in-situ monitoring data.
Among their revelations:
- In-situ process monitoring methods can create a closed-loop AM process capable of defect correction and control, to ensure process stability and repeatability.
- Integration of monitoring methods and machine learning in the AM process can help in continuously evaluating the quality of material deposition and developing intervention methods for correcting the defects in-situ.
- Using X-ray computed tomography can lead to an in-depth evaluation of defects, as well as an assessment of the quality of in-situ monitoring methods.
- Integration of quality monitoring methods with the manufacturing methods eliminates the requirement to conduct the quality assessment separately, which can save a significant amount of time.
Separately, Gupta this year was honored as a Fellow of ASM International, a global organization of more than 20,000 members. The organization recognized Gupta for “pioneering contributions to the science and technology of lightweight polymer and metal matrix composites” and exceptional dedication to educating the public about scientific discoveries.
The work was supported by the Texas A&M Engineering Experiment Station and the National Science Foundation.
Correlating wavelength dependence in LiMn2O4 cathode photo-accelerated fast charging with deformations in local structure
Electric vehicles are one of the best tools we currently have to combat the effects of climate change. There is one outstanding problem though: it takes a significant amount of time to charge a vehicle. Even fast-charging units take much longer to “fill up” an EV than gas pumps take to fill a tank. This may seem like a mere annoyance, but for people pressed for time it is no small hurdle to switching from internal combustion vehicles to EVs.
There have been numerous efforts at the material level to improve charging rates through engineered electrode coatings and nanostructuring of active materials, approaches that limit energy density in the battery pack. Additionally, the study of light interaction with energy storage materials has gained interest for use in photo-rechargeable batteries and integrated solar energy storage systems, though few studies have specifically investigated light interaction with commercially relevant battery materials. Facilitating electron movement in a battery material is key to lowering the resistance to the flow of charge. Applying light to induce light-matter interactions is one way to locally alter the electronic nature of the material. In particular, spinel LiMn2O4 (LMO), a cathode material, was shown to dramatically increase the charging current when exposed to white light during a voltage hold.
Now, a team of researchers led by André D. Taylor, professor of Chemical and Biomolecular Engineering and member of the Sustainable Engineering Initiative, as well as graduated PhD student Jason Lipton and Christopher Johnson of the Argonne National Laboratory, have discovered how different wavelengths of light can change the current in LMO cathodes. The team showed that illuminating with red light results in a higher charging rate compared to both ultraviolet illumination of equal optical power and dark conditions.
The team analyzed the effect of red light in the context of the electronic structure and possible excitations, and showed that Mn d-d electronic transitions occurring under red light illumination are largely responsible for the increased charging rate. They further demonstrated through X-ray absorption spectroscopy methods that LMO Mn-Mn bond distances shorten after d-electron excitation. The shrinkage in the crystal volume beneficially contributes to delithiation kinetics by lowering the resistance to lithium-ion conduction.
The results provide a roadmap for rapid discovery of candidate materials for photo-accelerated fast charging, through the review of calculated density of states data. Besides LMO, there is a wealth of materials to investigate with strong potential for photo-electrochemically induced activity. Once photo-acceleration is optimized for a given materials system, it is possible to envision using a small amount of the source current from a fast-charging station to power thin, flexible LEDs built into the spiral wound 18650 cells in a battery pack, enhancing the fast-charging capabilities in next-generation EVs while minimizing impact to energy density.