Investigating public perception of healthcare services from different perspectives may generate inconsistent results. For example, patient-initiated violence against health workers, and tension between doctors and patients arising from dissatisfaction with the quality of healthcare, have been widely covered in the Chinese media, whereas national patient experience surveys showed that patients were generally satisfied with both in-patient and out-patient services. Such differences may result from biases rooted in the surveys and in media coverage; however, the inconsistency also points to the need for additional data sources to monitor public opinion on Chinese healthcare services.
It has been suggested that social media might be such a data source. Rozenblum et al. pointed out that when patient-centered healthcare, the internet, and social media are combined, the current relationship between healthcare providers and consumers might face major changes, thus creating a “perfect storm”. Users’ posts on social media platforms generate a large volume of real-time data on public and private issues, with healthcare-related information scattered among them. The use of social media data for healthcare research has therefore become a rapidly growing field that already covers various areas of medical and healthcare research. Sinnenberg and colleagues proposed four ways in which social media data were used in healthcare studies: (1) content analysis, (2) volume surveillance of contents on specific topics, (3) engagement of users with others, and (4) network analysis of users. For content analysis, most studies focused on measuring public discussion of specific diseases, analyzing sentiment toward medical interventions (e.g., cancer screening), identifying safety concerns among health consumers, and detecting adverse events of health products. Several researchers have studied patient experience based on comments posted by patients in online health communities in China, but few studies have used social media data to gather information on topics related to healthcare services. Meanwhile, although sentiment analysis has been widely applied to health-related text, lexical resources and tools designed for health-related sentiment analysis in the Chinese language are few and far between.
As such, we selected WeChat and Qzone as the social media platforms for this exploratory study. The objectives of this study were to conduct volume and sentiment analyses based on social media content extracted on hospital healthcare services. The study could demonstrate social media users’ perceptions of hospital healthcare and may shed light on the further use of social media as a data source for healthcare research in China.
2.1. Study Design
This study consisted of three phases. Firstly, we used a predefined list of healthcare services categories to devise keywords and corresponding search strategies. The search strategies were then used to extract content from a raw database containing publicly posted materials from WeChat and Qzone, and the extracted materials were compiled into a corpus. Secondly, we applied natural language processing (NLP) techniques from the Tencent NLP platform to the corpus and calculated the volume of content concerning different healthcare services topics. Thirdly, we conducted sentiment analysis to explore the sentiment polarity of Chinese social media users on different healthcare services topics. The detailed process of data collection and analysis is presented in Figure 1. The study protocol was approved by the Ethics Committee of School of Public Health, Peking Union Medical College (71532014) and conducted under the academic collaborative project between Peking Union Medical College and Tencent.
2.2. Data Source
2.3. Healthcare Services Categories
The nine healthcare services categories used in this study were derived from the objectives of the National Healthcare Service Improvement Initiative (2015–2017), an initiative of the former National Health and Family Planning Commission of P.R. China (NHFPC) dedicated to improving patient-centered healthcare and patient experience nationwide. The initiative operated under the leadership of the Bureau of Medical Administration of NHFPC, which suggested that we use nine predefined categories to reflect healthcare services in hospitals (see Table 2).
2.4. Searching Strategy and Corpus Construction
In this study, we constructed a Chinese-language healthcare services corpus from the social media data source to enable further analyses. First, we constructed lexica of keywords and terms in accordance with the predefined service topics. For example, the lexicon for “Information technology” used in this study covers new information dissemination channels, based on information technology, that hospitals provide to improve patients’ experience of acquiring service information. This lexicon contains six information technology service-related terms, including “Weibo”, “WeChat”, “website”, and “Self-service machine”. Second, we developed a set of search strategies to extract the relevant data from the two sources based on the lexicon for each topic. The entire list of search terms for each category and the corresponding search strategies are provided in Supplementary Table S1. Finally, we applied the search strategies to the database of publicly posted materials to screen for posts related to the healthcare services categories and construct the corpus. The search and screening process was performed by Qcloud.
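The screening step described above can be illustrated with a minimal sketch. It assumes plain case-insensitive substring matching, which is a simplification: the actual search strategies are those listed in Supplementary Table S1 and were run by Qcloud. The lexicon below contains only the four terms named in the text, given in English for readability, and the function name `screen_posts` is hypothetical.

```python
# Hypothetical lexicon: only the four terms the text names for this category.
# The study's real lexica are in Chinese and listed in Supplementary Table S1.
LEXICA = {
    "information technology": ["Weibo", "WeChat", "website", "self-service machine"],
}


def screen_posts(posts, lexica):
    """Return (post, matched_categories) for posts that mention any lexicon term.

    Assumption: simple case-insensitive substring matching stands in for the
    study's search strategies.
    """
    kept = []
    for post in posts:
        text = post.lower()
        matched = sorted(
            category
            for category, terms in lexica.items()
            if any(term.lower() in text for term in terms)
        )
        if matched:
            kept.append((post, matched))
    return kept
```

Posts that match no term in any lexicon are dropped, which mirrors how the corpus is restricted to content related to the predefined categories.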
2.5. Analysis of the Social Media Content on Healthcare Services
Based on the healthcare services corpus, we classified the content into the predefined healthcare services topics and measured the content volume of each topic. Specifically, we used the open application programming interface (OpenAPI) services provided by Tencent NLP, an open platform for Chinese natural language processing based on parallel computing and a distributed crawling system, to analyze the retrieved content. These services enabled us to split reviews and blogs into sentences, and each sentence was screened for the keywords and terms of the target service topics. A sentence containing keywords or terms belonging to a healthcare services category listed in Table S1 was assigned to that category. By counting the number of sentences in the corpus in which the keywords of each service topic appeared, we aggregated the counts at the topic level and calculated the proportion of each topic in the social media corpus. For sentiment analysis in Chinese, we also selected Tencent NLP, as its algorithm was trained on hundreds of billions of entries of Chinese internet corpus data and has been successfully applied in other Tencent products (https://nlp.qq.com). The Tencent NLP OpenAPI functions for automatic summarization of batches of Chinese text and for sentiment analysis enabled us to classify the sentences on each topic in the social media corpus by sentiment polarity (i.e., neutral, positive, or negative). Finally, each sentence was tagged with its sentiment polarity.
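The sentence-level counting described above can be sketched as follows. This is an illustration under stated assumptions, not the Tencent NLP pipeline: sentence splitting is approximated with a regular expression on sentence-ending punctuation, keyword matching is a case-insensitive substring test, and the function names and example lexica are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split text on Chinese and Western sentence-ending punctuation (a rough
    stand-in for the platform's sentence segmentation)."""
    parts = re.split(r"[。！？!?.]+", text)
    return [p.strip() for p in parts if p.strip()]


def topic_sentence_counts(posts, lexica):
    """Count, per topic, the sentences that contain at least one topic term."""
    counts = Counter()
    for post in posts:
        for sentence in split_sentences(post):
            lowered = sentence.lower()
            for topic, terms in lexica.items():
                if any(term.lower() in lowered for term in terms):
                    counts[topic] += 1
    return counts


def topic_proportions(counts):
    """Convert topic-level sentence counts into shares of all tagged sentences."""
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}
```

Note that in this sketch a sentence mentioning terms from two topics is counted once under each topic; whether the study handled such sentences the same way is not stated in the text.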
3.1. Content Volume
The social media corpus contained approximately 29 million records from WeChat and Qzone related to hospital healthcare services, spanning the nine predefined categories.
Table 3 presents the content volume of each healthcare services topic by social media channel. Among the social media content on healthcare services topics, patient safety was the most commonly encountered topic on both WeChat and Qzone. It accounted for the largest share of content, with approximately 8.73 million records covering 30.1% of the entire corpus. The proportions of content related to the other topics varied: information technology (22.2%), service efficiency (17.9%), service environment (10.3%), inpatient service (9.6%), appointment-booking service (3.4%), nursing service (2.5%), doctor-patient relationship (2.5%), and humanistic care (1.5%).
3.2. Sentiment Analysis
The sentiment analysis found that, across all nine healthcare services topics, 36.1% of the content in the corpus was classified as positive, 16.4% as neutral, and 47.4% as negative. The topic with the highest proportion of positive content was service environment (59.6%), followed by patient safety (53.2%). Among the topics with more negative than positive content, doctor-patient relationship had the highest proportion of negative content (74.9%), followed by service efficiency (59.5%) and nursing service (53.0%). Notably, nearly one third of the content on appointment-booking service (30.4%) was neutral.
Additionally, the sentiment disposition of content differed across topics in ways that the content volume distribution did not reflect. For instance, Table 3 shows that nursing service and doctor-patient relationship accounted for equal proportions (2.5%) of the corpus; however, the sentiment disposition of social media users toward the two topics differed (Figure 2).
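The per-topic sentiment distribution reported in this section amounts to a cross-tabulation of sentence-level tags. The sketch below shows one way such percentages could be computed once each sentence carries a topic and a polarity label; the input format and the function name `polarity_shares` are assumptions for illustration, and the polarity labels themselves would come from the sentiment classifier rather than from this code.

```python
from collections import Counter, defaultdict

POLARITIES = ("positive", "neutral", "negative")


def polarity_shares(tagged_sentences):
    """Map each topic to the percentage of its sentences in each polarity.

    `tagged_sentences` is an iterable of (topic, polarity) pairs, a
    hypothetical format for the tagged corpus.
    """
    counts = defaultdict(Counter)
    for topic, polarity in tagged_sentences:
        counts[topic][polarity] += 1
    shares = {}
    for topic, c in counts.items():
        total = sum(c.values())
        shares[topic] = {p: 100.0 * c[p] / total for p in POLARITIES}
    return shares
```

Because the shares are computed within each topic, two topics with the same content volume can still show very different polarity profiles, as observed for nursing service and doctor-patient relationship.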
To our knowledge, this is the first study to explore public perceptions of healthcare services using publicly posted materials from two Chinese social media platforms. Our results showed that patient safety was the most frequently discussed topic among users of Chinese social media platforms, followed by information technology and service efficiency. Service environment had the highest proportion of positive comments.
The research assessed the application of content volume calculation and sentiment analysis to Chinese social media data. The study is a crucial step toward developing a methodology for harnessing social media data in China and an early attempt to track public perceptions of healthcare services by analyzing a unique data source.
This study found a large volume of content on information technology and service efficiency, which might reflect the series of efforts made by both the government and hospitals to integrate information technology into healthcare services in China. Several researchers have found that health information technology services were used to enhance patient experience and as a potential solution for shortening the lengthy waiting times in China’s public hospitals.
Humanistic care was the least mentioned topic in the corpus compiled for this study. This may suggest that Chinese social media users are not very familiar with the idea of humanistic care, although those who posted about it generally expressed a positive attitude. An alternative explanation might be that this type of care has yet to reach the wider public and has been experienced by only a few people. Further empirical or controlled studies may provide additional insights.
Our research also explored the sentiment disposition of social media content on healthcare services: 47.4% of the content was negative. Although these are only initial results, they could be quite alarming to healthcare administrators and policymakers. Although patient surveys in China have generally produced favorable results, there was still a substantial amount of negative comment on the social media platforms. More detailed methods are needed to better understand the negative comments.
Across the nine topics investigated in this study, we found large variations in both negative feedback and content volume. For instance, content related to doctor-patient relationship accounted for only 2.5% of the corpus, yet 74.9% of that content was negative. The varied distribution of sentiment polarity across topics may have important policy implications for healthcare reform in China. For example, 30.4% of the social media references to appointment-booking service were neutral, which may suggest that the public is unsure about this novel service. Patients have yet to become familiar with the service, even though it aims to improve convenience for patients as well as hospital efficiency. Such feedback could be essential for hospitals seeking to improve their service quality by enhancing patient education. Further research might focus on what exactly was discussed in the negative posts so that hospitals and responsible administrators can take targeted measures to improve services.
In line with previous evidence, our results show that social media could be a useful tool for health research in Chinese-language as well as English-language settings, and could be used to capture the public’s perspective on healthcare. However, the healthcare issue of greatest concern on social media appears to differ from what has been found in patient surveys. A recent qualitative study found that patients cared most about the environment and facilities in hospitals, whereas in our study patient safety had the greatest content volume. Another study, which examined online doctor reviews in China, found that most posts expressed positive attitudes towards physicians. Although the evidence on these issues is not yet conclusive, it might suggest a difference in perception between the general public and patients.
5. Strengths and Limitations
Our research extends the application of natural language processing techniques to the analysis of healthcare services-related content on China’s social media platforms and offers a new perspective on healthcare services in China’s hospitals. The results would benefit healthcare providers and regulators in benchmarking their performance on patient-centered healthcare delivery. This is important because social media has been considered a portal for health information acquisition among Chinese netizens; the social media perspective would therefore supplement the results of traditional paper-based surveys in understanding how consumers view healthcare services in hospitals.
Second, because we derived the healthcare services categories and lexica from the government document on the NHSII and from expert consultation, the corpus in this study may have failed to include a certain amount of healthcare services-related data. As a result, we may have underestimated the content volume of healthcare services on the two social media platforms. Furthermore, although all the material in the databases is in Chinese, and was therefore most likely generated by users from China, we are currently unable to determine whether the posts containing the key healthcare terms were describing the Chinese healthcare system or discussing foreign healthcare systems in the Chinese language. Further research may strive to develop search strategies that enable such a distinction and increase the specificity of the results.
Third, although the consumer health vocabulary (CHV), which has been used in previous research, is the gold standard reference for retrieving target data, no such open-source vocabulary list or corresponding lexica are available in the Chinese language. The accuracy and credibility of the sentiment analysis in this study also await further validation; however, this would require an alternative method for conducting sentiment analysis in Chinese and the possibility of applying such a method to the Tencent data, which were publicly posted materials but remain under strict terms of use. Another limitation is that we were unable to confirm that the data supplied by Tencent completely represent all users’ data, as there could be undocumented keyword filters on the platforms. These issues may introduce bias and limit the generalizability of our findings.
6. Further Research
Weibo is another popular Chinese social media platform, considered to be the counterpart of Twitter in China. Future research could consider extending the analysis to content from Weibo in order to explore users, and their views, not covered in this study.
Both the quantitative approach, as shown in our research, and the qualitative approach, such as face-to-face individual interviews, would be useful for better understanding consumer care in healthcare services. Empirical research using the latter approach is currently scarce, although it has been proposed as a complement to public perspectives on healthcare services.
Furthermore, the popularity of consumers’ unsolicited comments on healthcare providers in social media opens an important avenue for understanding patient experience, as demonstrated by previous research. Future research measuring patient experience at the hospital level based on social media data would help to better understand the landscape of healthcare quality in China.
By analyzing shared information from WeChat and Qzone, this study showed that patient safety was the topic of greatest concern for users of Chinese social media platforms, followed by information technology and service efficiency, while doctor-patient relationship had the highest proportion of negative comments. This study explored the possibility of using social media to monitor public perceptions of healthcare services. The findings provide an overview of public opinion on healthcare services, which could help regulators set benchmarks, at a national or regional level, for monitoring the progress of healthcare improvements across comparator districts and service domains. Social media analysis is also a necessary complement to the traditional paper-based consumer survey. The potential differences between social media perceptions and traditional consumer survey results would help regulators better understand gaps in the quality of care services from various perspectives. Further studies could also focus on extending the NLP method to more content-based resources to expand our understanding of mass opinion on healthcare services.