Mohamed Ihab Khalifa

Book Recommendation System

This project uses unsupervised machine learning to build a simple but robust book recommendation system that delivers personalized book recommendations. A recommendation system identifies the preferences of a given user and offers relevant suggestions or related content in return. Here, the recommender takes the title of a book from the user and returns tailored book recommendations, drawing on both content-based and genre-based similarity.

The system is built on a large dataset of books (taken from the Goodreads books database) comprising many different books, authors, genres, reviews, plot summaries and descriptions. It identifies similarities between the input book and the other books in the database across these dimensions, then selects and returns the most similar ones. The system also filters, preprocesses, and parses text to enable better matching and comparison. It ensures author variety, and it can be customized to increase or decrease the number of recommendations or to control whether the recommendations are content-based, genre-based, or a mixture of both. Together, these features make for a book recommender that can be used to search for and explore new books based on one's prior preferences and favorites.

The dataset was taken from Kaggle. It consists of thousands of books collected from Goodreads, a popular platform for discovering, reviewing, and discussing books. It provides a collection of more than 16,000 books in total, covering a wide range of authors, genres, and literary eras, ancient and modern.
It covers major literary works from ancient times up to May 2024. Each book, represented by a data row, comes with important details, including the book title, author, genre classification, publication date, format, and average rating score. As such, the data can support a variety of purposes, from data analysis and the study of user preferences to sentiment analysis and building recommendation systems, as in the current case. The dataset is released under the MIT license, which permits free use for commercial and noncommercial purposes. Each column and its description are listed in the table below:

Variable             Description
book_id              Unique identifier for each book in the data
cover_image_uri      URI or URL pointing to the cover image of the book
book_title           Title of the book
book_details         Details about the book, including summary, plot, synopsis or other descriptive information
format               Details about the format of the book, such as whether it's a hardcover, paperback, or audiobook
publication_info     Information about the publication of the book, including the publisher, publication date, or any other relevant details
authorlink           URI or URL pointing to more information about the author (if available)
author               Name of the book author(s)
num_pages            Number of pages
genres               Genre labels applying to the book
num_ratings          Total number of ratings
num_reviews          Total number of reviews
average_rating       Overall average rating score
rating_distribution  Number of ratings per rating star (for a 5-point rating system)

To develop the book recommendation system, the dataset is first inspected, cleaned, filtered, and updated in preparation for analysis and model development.
After the data has been prepared and analyzed, a Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer is used for text vectorization, converting each book's important attributes (author, genres, and plot summary or book description) into a numeric vector of TF-IDF scores that represents the book and how it compares to all others in the dataset. These TF-IDF vectors are then compared using cosine similarity, producing a large matrix of pairwise overall similarities between books. In addition, a separate matrix is built from book genres alone to capture the genre similarity between books (using Jaccard similarity).

With the analysis and modeling complete, a book recommendation function is developed that uses the similarity matrices to deliver tailored book recommendations. As mentioned, this function offers options to control the nature of the recommendations, such as whether to recommend by genre in particular or by overall similarity more generally, and how many books to recommend. Finally, the recommender is put to the test: first with well-known books (e.g., Shakespeare's 'Macbeth'), then with book titles sampled at random from the database, and lastly with user input, where the user passes any book they want similar recommendations for and the recommendation function takes care of the rest. You can try the recommender yourself.
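The two similarity measures described above can be illustrated on a toy example. This is only a minimal sketch, not the project's actual pipeline: the three books, their texts, and their genre sets are hypothetical, and the variable names (`overall_sim`, `genre_sim`, `ranked`) are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data: one combined text (author + genres + description) per book
books = ["Book A", "Book B", "Book C"]
texts = [
    "rowling fantasy magic young wizard attends a school of magic",
    "rowling fantasy magic young wizard faces a dark lord",
    "tuchman history war the opening month of the first world war",
]
genres = [{"Fantasy", "Magic"}, {"Fantasy", "Magic", "Young Adult"}, {"History", "War"}]

# Overall similarity: TF-IDF vectors compared with cosine similarity
tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
overall_sim = cosine_similarity(tfidf)

# Genre similarity: boolean one-hot genre matrix compared with Jaccard.
# pdist returns Jaccard *distances*, so similarity is 1 - distance.
labels = sorted(set().union(*genres))
onehot = np.array([[label in book_genres for label in labels] for book_genres in genres])
genre_sim = 1 - squareform(pdist(onehot, metric="jaccard"))

# Rank the other books by overall similarity to the first book
query = 0
ranked = [books[i] for i in np.argsort(-overall_sim[query]) if i != query]
print(ranked)
print(round(genre_sim[0, 1], 3))
```

Books A and B share two of three distinct genre labels, so their Jaccard similarity is 2/3, while Book C shares neither vocabulary nor genres with Book A and ranks last.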
Overall, the project is broken down into 7 sections:
1) Reading and Inspecting the Data
2) Cleaning and Updating the Data
3) Exploratory Data Analysis
4) Feature Engineering: Combining features
5) Text Vectorization and Processing
6) Building a Book Recommendation Function
7) Testing the Recommendation System

In [ ]:
#If you're using the executable notebook version, please run this cell first
# to install the necessary Python libraries for the task
!pip install numpy
!pip install pandas
!pip install matplotlib
!pip install seaborn
!pip install scipy
!pip install scikit-learn
#requests and pillow are imported below for downloading and displaying book covers
!pip install requests
!pip install pillow

In [ ]:
#Importing the modules for use
import re
import math
import requests
import textwrap
import numpy as np
import pandas as pd
from PIL import Image
from io import BytesIO
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.sparse import csr_matrix
from scipy.spatial.distance import squareform, pdist, jaccard
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import warnings
warnings.simplefilter("ignore")
sns.set_context('paper')
%matplotlib inline

Defining Custom Functions

In [ ]:
#Define function to display books by their covers
def get_covers(books_df: pd.DataFrame):
    n_books = len(books_df.index)
    n_cols = ((n_books + 1) // 2) if n_books > 5 else n_books
    n_rows = math.ceil(n_books / n_cols)
    #create figure and specify subplot characteristics
    plt.figure(figsize=(4.2*n_cols, 6.4*n_rows), facecolor='whitesmoke')
    plt.subplots_adjust(bottom=.1, top=.9, left=.02, right=.88, hspace=.32)
    plt.rcParams.update({'font.family': 'Palatino Linotype'}) #adjust font type
    #request, access and plot each book cover
    for i in range(n_books):
        try:
            response = requests.get(books_df['cover_image_uri'].iloc[i])
        except requests.exceptions.RequestException:
            #catch network errors only, rather than every possible exception
            print('\nCouldn\'t retrieve book cover. Check your internet connection and try again...\n\n', flush=True)
            return
        #access and resize image
        img = Image.open(BytesIO(response.content))
        img = img.resize((600, 900))
        #shorten and wrap book title
        full_title = books_df['book_title'].iloc[i]
        short_title = re.sub(r'[:?!].*', '', full_title)
        title_wrapped = "\n".join(textwrap.wrap(short_title, width=26))
        #plot book cover
        plt.subplot(n_rows, n_cols, i+1)
        plt.imshow(img)
        plt.title(title_wrapped, fontsize=21, pad=15)
        plt.axis('off')
    plt.show()

Part One: Reading and Inspecting the Data

Loading and reading the dataset

In [ ]:
#Access and read data into dataframe
df = pd.read_csv('Book_Details.csv', index_col='Unnamed: 0')
#drop unnecessary columns
df = df.drop(['book_id', 'format', 'authorlink', 'num_pages'], axis=1)

Inspecting the data

In [ ]:
#report the shape of the dataframe
shape = df.shape
print('Number of columns:', shape[1])
print('Number of rows:', shape[0])

Number of columns: 10
Number of rows: 16225

In [ ]:
#Preview first 5 entries
df.head()

Out[ ]:
   cover_image_uri                                  book_title
0  https://images-na.sslimagesamazon.com/images...  Harry Potter and the Half-Blood Prince
1  https://images-na.sslimagesamazon.com/images...  Harry Potter and the Order of the Phoenix
2  https://images-na.sslimagesamazon.com/images...  Harry Potter and the Sorcerer's Stone
3  https://images-na.sslimagesamazon.com/images...  Harry Potter and the Prisoner of Azkaban
4  https://images-na.sslimagesamazon.com/images...  Harry Potter and the Goblet of Fire

   book_details                                       publication_info
0  It is the middle of the summer, but there is a...  ['First published July 16, 2005']
1  Harry Potter is about to start his fifth year ...  ['First published June 21, 2003']
2  Harry Potter has no idea how famous he is. Tha...  ['First published June 26, 1997']
3  Harry Potter, along with his best friends, Ron...  ['First published July 8, 1999']
4  It is the summer holidays and soon Harry Potte...  ['First published July 8, 2000']

   author        genres
0  J.K. Rowling  ['Fantasy', 'Young Adult', 'Fiction', 'Magic',...
1  J.K. Rowling  ['Young Adult', 'Fiction', 'Magic', 'Childrens...
2  J.K. Rowling  ['Fantasy', 'Fiction', 'Young Adult', 'Magic',...
3  J.K. Rowling  ['Fantasy', 'Fiction', 'Young Adult', 'Magic',...
4  J.K. Rowling  ['Fantasy', 'Young Adult', 'Fiction', 'Magic',...

   num_ratings  num_reviews  average_rating  rating_distribution
0  -            58398        4.58            {'5': '2,244,154', '4': '775,028', '3': '21...
1  -            64300        4.50            {'5': '2,178,760', '4': '856,178', '3': '29...
2  -            163493       4.47            {'5': '6,544,542', '4': '2,348,390', '3': '8...
3  -            84959        4.58            {'5': '2,892,322', '4': '970,190', '3': '28...
4  -            69961        4.57            {'5': '2,500,070', '4': '899,496', '3': '25...

Checking number of entries and data type per column

In [ ]:
#Inspect column headers, data type, and number of entries
df.info()

Index: 16225 entries, 0 to 16224
Data columns (total 10 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   cover_image_uri      16225 non-null  object
 1   book_title           16225 non-null  object
 2   book_details         16177 non-null  object
 3   publication_info     16225 non-null  object
 4   author               16225 non-null  object
 5   genres               16225 non-null  object
 6   num_ratings          16225 non-null  int64
 7   num_reviews          16225 non-null  int64
 8   average_rating       16225 non-null  float64
 9   rating_distribution  16225 non-null  object
dtypes: float64(1), int64(2), object(7)
memory usage: 1.4+ MB

Descriptive Statistics

In [ ]:
#get overall description of object columns
display(df.describe(include='object').T)
print('\n'+ 80*'_' +'\n')
#get statistical summary of the numerical data
display(df.describe().drop(['25%', '50%', '75%']).apply(lambda x: round(x)))

                     count  unique  top                                                 freq
cover_image_uri      16225  16120   https://dryofg8nmyqjw.cloudfront.net/images/no...   38
book_title           16225  15491   The Cheat Code                                      7
book_details         16177  16018   Libro usado en buenas condiciones, por su anti...   6
publication_info     16225  5369    ['First published January 1, 2008']                 360
author               16225  7615    Stephen King                                        79
genres               16225  13773   []                                                  325
rating_distribution  16225  16093   {'5': '0', '4': '0', '3': '0', '2': '0', '1': ...   12

________________________________________________________________________________

       num_ratings  num_reviews  average_rating
count  16225.0      16225.0      16225.0
mean   85785.0      5156.0       4.0
std    -            15776.0      0.0
min    0.0          0.0          0.0
max    -            -            5.0

Notably, the descriptions above show that some books are duplicated, since the total count of book titles (16,225) does not match the number of unique book titles (15,491). Second, some books have no description or details, since the number of entries in the 'book_details' column is lower than in all the other columns. Finally, many books have no specified genre: 325 of them have an empty list in the genres column.

Consistent with these findings, I will now clean and update the data to deal with each of these issues. First, I will drop the duplicated books, then deal with books lacking details or descriptions, and then deal with the issue of genre, either by assigning a book the genre labels found for the same author elsewhere in the dataset or, if none are found, by removing the book. This is because genre is a critical factor in deciding book similarity and recommendation, as the recommender to be built will leverage genre similarity and not just book content. Finally, I will add a new column for year of publication, extracted from the 'publication_info' column, before dropping that column, as it would no longer be informative.

Part Two: Cleaning and Updating the Data

In this section, I will clean and update the data based on the observations reported above in order to prepare it for further analysis and model development.
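The three issues noted above can each be checked with a one-line count. The following is a minimal sketch on a hypothetical four-row frame (`toy` and its contents are invented for illustration; only the column names come from the dataset):

```python
import pandas as pd

# Hypothetical toy frame standing in for the real 16,225-row dataset
toy = pd.DataFrame({
    "book_title": ["Dune", "Dune", "Emma", "Ulysses"],
    "book_details": ["Desert planet.", "Desert planet.", None, "A day in Dublin."],
    "genres": ["['Science Fiction']", "['Science Fiction']", "['Classics']", "[]"],
})

# 1) Repeated titles: total rows minus distinct titles
n_repeated = len(toy) - toy["book_title"].nunique()

# 2) Missing descriptions: NaN entries in the details column
n_missing_details = int(toy["book_details"].isna().sum())

# 3) Books without genres: the raw column stores lists as strings,
#    so an empty genre list appears as the string '[]'
n_empty_genres = int((toy["genres"] == "[]").sum())

print(n_repeated, n_missing_details, n_empty_genres)
```

Applied to the summary tables above, the same arithmetic gives 16,225 - 15,491 = 734 rows whose title repeats an earlier one, and 16,225 - 16,177 = 48 books with missing details.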
Removing duplicate books

In [ ]:
#first, normalize book titles by removing punctuation
df['normalized_title'] = df['book_title'].apply(lambda title: re.sub(r'[^\w\s]', '', title))
#drop duplicate book titles and reset dataframe index
df = df.drop_duplicates(subset='normalized_title', ignore_index=True)

Dealing with missing or inappropriate book details

In [ ]:
#check the number of books with missing (NaN) values in the book details column
print('Number of entries with NaN values in the book details column (before): ', df['book_details'].isna().sum())
#fill NaN book details with empty strings
df['book_details'] = df['book_details'].fillna('')
#check the number of entries after
print('\nNumber of entries with NaN values in the book details column (after): ', df['book_details'].isna().sum())

Number of entries with NaN values in the book details column (before):  48

Number of entries with NaN values in the book details column (after):  0

Cleaning and updating the genres column

After turning the genres into a plain string, I will check the number of empty strings and then assign genre labels by author; if no genre labels are found for the author, I will delete the book.

In [ ]:
#Changing string list to list then to string with the genres of books
df['genres'] = df['genres'].apply(lambda x: ', '.join(eval(x)))

In [ ]:
#Updating rows with no genre
#get indices of books with no genre labels
no_genre_before = df[df['genres'].str.len() == 0].index
#we can preview the books identified
df.iloc[no_genre_before, 1:8].head(3)

Out[ ]:
      book_title                                         book_details
570   Angels & Guides Healing Meditations                You’ll find a new level of comfort, safety, an...
2749  La Santa Muerte                                    Narcotraficantes, políticos, delincuentes, emp...
4399  Rush Hudson Limbaugh and His Times: Reflection...  This series of interviews with Rush H. Limbaug...

      publication_info                       author          genres  num_ratings  num_reviews
570   ['First published September 1, 2006']  Sylvia Browne           53           1
2749  ['First published January 31, 2004']   Homero Aridjis          29           5
4399  ['First published November 1, 2003']   Rush Limbaugh           6            0

In [ ]:
#Get total number of books with no genre before the update
print('Total number of entries with missing genre (before): ', len(df.iloc[no_genre_before]))
#change empty strings with genres common to given author
for i in no_genre_before:
    #look the author up by index label: rows are dropped inside this loop,
    #so positional access with .iloc[i] would drift to the wrong row
    genre_labels = df[df['author']==df.at[i, 'author']]['genres'].iloc[0]
    if len(genre_labels) > 0:
        df.at[i, 'genres'] = genre_labels
    else:
        df.drop(index=i, inplace=True)
#resetting dataframe index
df.reset_index(drop=True, inplace=True)
#check number of books with no genre after the update
no_genre_after = df[df['genres'].str.len() == 0].index
print('\nTotal number of entries with missing genre (after): ', len(df.iloc[no_genre_after]))

Total number of entries with missing genre (before):  319

Total number of entries with missing genre (after):  0

Finally, in dealing with genre, I will make sure that genre labels do not conflict with one another. In particular, if a book has 'Fiction' among its genre labels, it should not simultaneously be classified as 'Nonfiction', as this would mix up some of the recommendations. First, let's preview the books that suffer from this issue.

Dealing with conflicting book genres

In [ ]:
#create empty list for storing indices of books with conflicting genres and set count to zero
indices=[]
count=0
#loop over and return all books with conflicting genres
for genre_string, title in zip(df['genres'], df['book_title']):
    if 'Fiction' in genre_string and 'Nonfiction' in genre_string:
        count += 1
        indices.append(df[df['book_title']==title].index)
        print(f'{count}. {title} // {genre_string}')

1. If I Die in a Combat Zone, Box Me Up and Ship Me Home // Nonfiction, War, History, Memoir, Military Fiction, Biography, Biography Memoir
2. Dispatches // Nonfiction, History, War, Memoir, Journalism, Military Fiction, Military History
3. The Last Stand of the Tin Can Sailors: The Extraordinary World War II Story of the U.S. Navy's Finest Hour // History, Nonfiction, Military Fiction, World War II, War, Military History, Naval History
4. Jesus Freaks: Stories of Those Who Stood for Jesus, the Ultimate Jesus Freaks // Christian, Nonfiction, Biography, Christianity, Religion, Faith, Christian Non Fiction
5. Flags of Our Fathers // History, Nonfiction, Military Fiction, War, World War II, Biography, Military History
6. The March of Folly // History, Nonfiction, Politics, War, World History, Military History, Military Fiction
7. The Art of War // Nonfiction, Philosophy, History, War, Business, Classics, Military Fiction
8. In Pharaoh's Army: Memories of the Lost War // Memoir, Nonfiction, War, History, Biography, Military Fiction, Biography Memoir
9. Imperial Life in the Emerald City: Inside Iraq's Green Zone // Nonfiction, History, Politics, War, Military Fiction, Journalism, Military History
10. State of Denial // Politics, History, Nonfiction, War, American History, Presidents, Military Fiction
11. Charlie Wilson's War: The Extraordinary Story of How the Wildest Man in Congress and a Rogue CIA Agent Changed the History of our Times // History, Nonfiction, Politics, War, Biography, Military Fiction, American History
12. Band of Brothers: E Company, 506th Regiment, 101st Airborne from Normandy to Hitler's Eagle's Nest // History, Nonfiction, War, Military Fiction, World War II, Military History, Historical
13. In Harm's Way: The Sinking of the USS Indianapolis and the Extraordinary Story of Its Survivors // History, Nonfiction, Military Fiction, World War II, War, Survival, Military History
14. We Were Soldiers Once... and Young: Ia Drang - The Battle that Changed the War in Vietnam // History, Nonfiction, Military Fiction, War, Military History, American History, Biography
15. The Fall of Berlin 1945 // History, Nonfiction, World War II, War, Military History, Germany, Military Fiction
16. The Civil War, Vol. 1: Fort Sumter to Perryville // History, Civil War, Nonfiction, American History, American Civil War, War, Military Fiction
17. The Mask of Command // History, Military History, Military Fiction, Nonfiction, Leadership, War, Biography
18. Black Hawk Down: A Story of Modern War // History, Nonfiction, Military Fiction, War, Military History, Africa, Historical
19. Ghost Wars: The Secret History of the CIA, Afghanistan, and Bin Laden from the Soviet Invasion to September 10, 2001 // History, Nonfiction, Politics, War, Military Fiction, Terrorism, Espionage
20. Jarhead : A Marine's Chronicle of the Gulf War and Other Battles // Nonfiction, War, Military Fiction, Memoir, History, Biography, Military History
21. Fiasco: The American Military Adventure in Iraq // History, Nonfiction, Politics, War, Military Fiction, Military History, American History
22. Ghost Soldiers: The Epic Account of World War II's Greatest Rescue Mission // History, Nonfiction, World War II, War, Military Fiction, Military History, American History
23. Vietnam: A History // History, Nonfiction, War, Military Fiction, Military History, American History, Politics
24. A World Undone: The Story of the Great War, 1914 to 1918 // History, Nonfiction, World War I, War, Military History, Military Fiction, Audiobook
25. The First Day on the Somme // History, World War I, Nonfiction, War, Military History, Military Fiction, 20th Century
26. The Forgotten Soldier // History, Nonfiction, War, Military Fiction, World War II, Biography, Military History
27. This Kind of War: A Study in Unpreparedness // History, Military Fiction, Nonfiction, War, Military History, American History, Asia
28. Henry James: A Life in Letters // Biography, Nonfiction, Classics, Literary Fiction, American
29. Company Commander: The Classic Infantry Memoir of World War II // History, Military Fiction, Military History, Nonfiction, World War II, War, Biography
30. Flyboys: A True Story of Courage // History, Nonfiction, World War II, War, Military Fiction, Military History, Biography
31. Hitler's War // History, World War II, Nonfiction, War, Biography, Politics, Military Fiction
32. Leadership Secrets of Attila the Hun // Leadership, Business, Nonfiction, History, Management, Self Help, Military Fiction
33. The New Dare to Discipline // Parenting, Nonfiction, Christian, Family, Self Help, Psychology, Christian Non Fiction
34. Life Application Study Bible: NIV // Christian, Religion, Nonfiction, Christianity, Reference, Spirituality, Christian Non Fiction
35. The Face of Battle: A Study of Agincourt, Waterloo and the Somme // History, Nonfiction, Military History, Military Fiction, War, European History, World War I
36. To Hell and Back // History, Nonfiction, Biography, Military Fiction, War, World War II, Military History
37. Strategy // History, Nonfiction, Military Fiction, War, Military History, Business, Politics
38. The Troubles: Ireland's Ordeal- and the Search for Peace // History, Ireland, Nonfiction, Politics, Irish Literature, Military Fiction, European History
39. Against All Enemies: Inside America's War on Terror // Politics, Nonfiction, History, War, Terrorism, Military Fiction, American History
40. The Best and the Brightest // History, Nonfiction, Politics, War, American History, International Relations, Military Fiction
41. A Bright Shining Lie: John Paul Vann and America in Vietnam // History, Nonfiction, War, Biography, American History, Military Fiction, Military History
42. Killing Pablo: The Hunt for the World's Greatest Outlaw // Nonfiction, History, True Crime, Crime, Biography, Military Fiction, Politics
43. Dereliction of Duty: Lyndon Johnson, Robert McNamara, the Joint Chiefs of Staff, and the Lies That Led to Vietnam // History, Politics, Nonfiction, Military Fiction, War, Military History, American History
44. Enemy at the Gates: The Battle for Stalingrad // History, Nonfiction, War, World War II, Military History, Military Fiction, Russia
45. The Coldest Winter: America and the Korean War // History, Nonfiction, War, Military History, Military Fiction, American History, Politics
46. The War: An Intimate History,- // History, Nonfiction, World War II, War, Military Fiction, American History, Military History
47. An Army at Dawn: The War in North Africa,- // History, Nonfiction, World War II, Military History, War, Military Fiction, Africa
48. Quartered Safe Out Here: A Harrowing Tale of World War II // History, Nonfiction, War, Memoir, World War II, Military History, Military Fiction
49. Stalingrad: The Fateful Siege,- // History, Nonfiction, War, World War II, Russia, Military History, Military Fiction
50. Mind Siege: The Battle for the Truth // Christian, Religion, Nonfiction, Christianity, Faith, Christian Non Fiction, Spirituality
51. Lectures on Faith // Religion, Lds, Nonfiction, Church, Spirituality, Lds Non Fiction, Theology
52. The Price of Admiralty: The Evolution of Naval Warfare from Trafalgar to Midway // History, Military History, Military Fiction, Nonfiction, War, Naval History, European History
53. Lone Survivor: The Eyewitness Account of Operation Redwing and the Lost Heroes of SEAL Team 10 // Nonfiction, Military Fiction, History, War, Biography, Memoir, Military History
54. With the Old Breed: At Peleliu and Okinawa // History, Nonfiction, War, Military Fiction, World War II, Biography, Memoir
55. The Puzzle Palace: Inside the National Security Agency, America's Most Secret Intelligence Organization // History, Nonfiction, Espionage, Politics, Military Fiction, Technology, Government
56. The Late Great Planet Earth // Religion, Christian, Nonfiction, Christianity, Theology, Christian Non Fiction, Spirituality
57. Great Escape // History, Nonfiction, War, World War II, Military Fiction, Historical, Military History
58. Platoon Leader: A Memoir of Command in Combat // Military Fiction, History, War, Military History, Leadership, Nonfiction, Biography
59. The Butterfly Dreams // Memoir, Nonfiction, War, History, Biography, Military Fiction, Biography Memoir
60. Supplying War: Logistics from Wallenstein to Patton // History, Military History, Military Fiction, War, Nonfiction, Economics, Academic
61. Comrade J: Untold Secrets Of Russia's Master Spy In America After The End Of The Cold War // Nonfiction, History, Espionage, Russia, Biography, Military Fiction, True Crime
62. The Monster Loves His Labyrinth // Poetry, Nonfiction, Literature, Literary Fiction, Essays
63. A Question of Honor: The Kosciuszko Squadron: Forgotten Heroes of World War II // History, Nonfiction, War, World War II, Poland, Aviation, Military Fiction
64. Human rights and legal defense in Northern Ireland: The intimidation of defense lawyers : the murder of Patrick Finucane // Christian, Prayer, Nonfiction, Spirituality, Christian Non Fiction, Faith, Christian Living
65. The Power of Praying Through the Bible // Christian, Prayer, Nonfiction, Spirituality, Christian Non Fiction, Faith, Christian Living
66. Soldiers Of Reason: The RAND Corporation And The Rise Of The American Empire // History, Nonfiction, Military Fiction, Politics, Science, American History, American
67. 1001 Books for Every Mood // Nonfiction, Books About Books, Reference, Writing, Literary Criticism, Literature, Literary Fiction
68. Lydia // History, Nonfiction, Politics, American History, War, Russia, Military Fiction
69. One Minute to Midnight: Kennedy, Khrushchev and Castro on the Brink of Nuclear War // History, Nonfiction, Politics, American History, War, Russia, Military Fiction
70. The War Path: Hitler's Germany,- // History, World War II, Nonfiction, Germany, Military Fiction
71. The Angel of Grozny: Orphans of a Forgotten War // Nonfiction, Russia, History, War, Journalism, Military Fiction, Islam
72. The Apostle: A Life of Paul // Biography, Christian, Religion, Nonfiction, History, Christianity, Christian Non Fiction
73. Kill Bin Laden: A Delta Force Commander's Account of the Hunt for the World's Most Wanted Man // Military Fiction, Nonfiction, History, War, Military History, Terrorism, Historical
74. The Bitter Road to Freedom: A New History of the Liberation of Europe // History, Nonfiction, World War II, War, European History, Military History, Military Fiction
75. The Battle of the Bulge // History, Nonfiction, World War II, War, Military History, Military Fiction, Audiobook
76. The Dark Side: The Inside Story of How the War on Terror Turned Into a War on American Ideals // Nonfiction, Politics, History, War, Terrorism, American History, Military Fiction
77. Camille Saint-Saëns: On Music and Musicians // History, Africa, Military Fiction, South Africa, War, Nonfiction, Military History
78. Commando: A Boer Journal Of The Boer War // History, Africa, Military Fiction, South Africa, War, Nonfiction, Military History
79. Sledge Patrol: A WWII Epic Of Escape, Survival, And Victory // History, Nonfiction, World War II, Survival, Adventure, War, Military Fiction
80. Radical Womanhood: Feminine Faith in a Feminist World // Christian, Nonfiction, Christianity, Christian Living, Faith, Christian Non Fiction, Theology
81. The Good Soldiers // Nonfiction, War, History, Military Fiction, Military History, Politics, Journalism
82. The Long Gray Line: The American Journey of West Point's Class of 1966 // History, Nonfiction, Military Fiction, Military History, American History, Biography, War
83. Lost in Shangri-la: A True Story of Survival, Adventure, and the Most Incredible Rescue Mission of World War II // Nonfiction, History, World War II, War, Adventure, Survival, Military Fiction
84. Give Me Tomorrow: The Korean War's Greatest Untold Story // History, Nonfiction, Military Fiction, Military History, War, Biography, Audiobook
85. Red Eagles: Americas Secret MiGs // Aviation, History, Military Fiction, Nonfiction, Military History, Aircraft, War
86. What It is Like to Go to War // Nonfiction, History, War, Military Fiction, Memoir, Biography, Psychology
87. American Sniper: The Autobiography of the Most Lethal Sniper in U.S. Military History // Nonfiction, Biography, Military Fiction, History, War, Memoir, Autobiography
88. Shot Down: The True Story of Pilot Howard Snyder and the Crew of the B-17 Susan Ruth // History, Nonfiction, Military Fiction, Adult, Biography, Aviation, Adventure
89. Extreme Ownership: How U.S. Navy SEALs Lead and Win // Leadership, Business, Nonfiction, Self Help, Personal Development, Management, Military Fiction
90. Defeating Jihad: The Winnable War // Politics, Nonfiction, History, Military Fiction, Terrorism, Military History, War
91. Real Friends // Graphic Novels, Middle Grade, Memoir, Comics, Childrens, Realistic Fiction, Nonfiction
92. Grunt: The Curious Science of Humans at War // Nonfiction, Science, War, History, Military Fiction, Humor, Audiobook
93. Is Goat Beef? // Nonfiction, Humor, War, True Story, Military Fiction, History, Adult
94. Huế 1968: A Turning Point of the American War in Vietnam // History, Nonfiction, War, Military History, Military Fiction, American History, Asia
95. Vietnam: An Epic Tragedy,- // History, Nonfiction, War, Military History, Military Fiction, American History, Politics
96. The Guns of August // History, Nonfiction, War, World War I, Military Fiction, Military History, Politics
97. Whispers In The Tall Grass // History, Military Fiction, Nonfiction, War, Biography, Military History, Memoir
98. Operation Pedestal: The Fleet That Battled to Malta, 1942 // History, Nonfiction, World War II, Military History, War, Military Fiction, Historical
99. The Bomber Mafia: A Dream, a Temptation, and the Longest Night of the Second World War // History, Nonfiction, Audiobook, War, World War II, Military Fiction, Historical
100. The Mosquito Bowl: A Game of Life and Death in World War II // Nonfiction, History, Sports, World War II, Military Fiction, War, Football
101. Prisoners of the Castle: An Epic Story of Survival and Escape from Colditz, the Nazis' Fortress Prison // History, Nonfiction, World War II, War, Historical, Biography, Military Fiction
102. Diplomats & Admirals: From Failed Negotiations and Tragic Misjudgments to Powerful Leaders and Heroic Deeds, the Untold Story of the Pacific War from Pearl Harbor to Midway // History, Nonfiction, War, Military Fiction, World War II, Japan, Politics

As demonstrated, most of the books listed here are about historical wars, presumably with an element of fiction, hence they are classified as 'Nonfiction' and simultaneously as 'Military Fiction'. A few books are classified as both 'Nonfiction' and 'Literary Fiction', and at least one is classified as both 'Nonfiction' and 'Realistic Fiction'. These seem to be literary works that do mix the two. Finally, a few other books are classified as 'Nonfiction' and 'Christian Non Fiction'; these are flagged only because the two-word label 'Non Fiction' contains the substring 'Fiction'.

To deal with this, I will simply replace 'Military Fiction' with 'Military', 'Literary Fiction' with 'Literary', and 'Realistic Fiction' with 'Realistic'. Finally, for the purposes of accurate text processing, I will change 'Non Fiction' to 'Nonfiction', so that a label such as 'Christian Non Fiction' becomes 'Christian Nonfiction'.
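To see why a label such as 'Christian Non Fiction' trips the conflict check, and how the replacements resolve it, here is a small standalone sketch using plain strings. The helper names `has_conflict` and `normalize_genres` are hypothetical and exist only for this illustration; the notebook itself works on the dataframe column directly.

```python
import re

# Replacement rules for the conflicting labels (old label -> normalized label)
REPLACEMENTS = {
    "Military Fiction": "Military",
    "Literary Fiction": "Literary",
    "Realistic Fiction": "Realistic",
    "Non Fiction": "Nonfiction",
}

def has_conflict(genre_string: str) -> bool:
    """Case-sensitive substring test; 'Nonfiction' alone does not contain 'Fiction'."""
    return "Fiction" in genre_string and "Nonfiction" in genre_string

def normalize_genres(genre_string: str) -> str:
    """Apply each replacement in turn to a comma-separated genre string."""
    for old, new in REPLACEMENTS.items():
        genre_string = re.sub(re.escape(old), new, genre_string)
    return genre_string

before = "Christian, Nonfiction, Faith, Christian Non Fiction"
after = normalize_genres(before)
print(has_conflict(before), has_conflict(after))
print(after)
```

Because the test is case-sensitive, the lowercase "fiction" inside 'Nonfiction' never matches 'Fiction'. A string that pairs 'Nonfiction' with a label outside the replacement rules, such as 'Historical Fiction', would still be flagged after normalization.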
In [ ]:
#create dictionary with sub-strings to be replaced or removed
replacements_dict = {
    'Military Fiction': 'Military',
    'Literary Fiction': 'Literary',
    'Realistic Fiction': 'Realistic',
    'Non Fiction': 'Nonfiction'
}
#replace substrings according to specified values
df['genres'] = df['genres'].replace(replacements_dict, regex=True)

#Now we can check again
count = 0
for genre_string in df['genres']:
    if 'Fiction' in genre_string and 'Nonfiction' in genre_string:
        count += 1
print(f'Number of books with conflicting genres: {count}')

Number of books with conflicting genres: 0

Creating a column with publication year

In [ ]:
import ast

#Changing string list in publication info column to normal string
#(ast.literal_eval parses the list literal without executing arbitrary code, unlike eval)
def first_publication_entry(x):
    entries = ast.literal_eval(x)
    return entries[0] if len(entries) > 0 else 'n.d.'

df['publication_info'] = df['publication_info'].apply(first_publication_entry)
#extract the trailing year from the publication info column and assign it to a new data column,
#'publication_year' (left as an empty string when no year is found, as for 'n.d.')
df['publication_year'] = df['publication_info'].str.extract(r'(\d{1,4}$)', expand=False).fillna('')
#preview changes and new publication year column
df[['publication_info', 'publication_year']].sample(5)

Out[ ]:
      publication_info                  publication_year
547   First published December 1, 1980  1980
7195  First published January 1, 1937   1937
8293  First published January 1, 1798   1798
1428  First published June 7, 1926      1926
6407  First published January 28, 2003  2003

Part Three: Exploratory Data Analysis

In this section, I will explore the dataset in more detail, performing some further data analysis and visualization to get familiar with the data and delineate some of the underlying relationships. I will examine the most common book genres in the data, the top-rated books, the distribution of rating scores, and the relationship between the number of ratings and the average rating score.
Top 20 book genres featured in the data

In [ ]:
#Create one-hot encoded dataframe with all unique genres in the data
genres_df = df['genres'].str.get_dummies(', ').astype(int)
#preview genres dataframe
genres_df.head()

Out[ ]:
   12th Century  13th Century  15th Century  16th Century  17th Century  18th Century  19th Century  1st Grade  20th Century  21st Century  ...  X Men  Yaoi  Young Adult  Young Adult Contemporary  Young Adult Fantasy
0  0  0  0  0  0  0  0  0  0  0  ...  0  0  1  0  0
1  0  0  0  0  0  0  0  0  0  0  ...  0  0  1  0  0
2  0  0  0  0  0  0  0  0  0  0  ...  0  0  1  0  0
3  0  0  0  0  0  0  0  0  0  0  ...  0  0  1  0  0
4  0  0  0  0  0  0  0  0  0  0  ...  0  0  1  0  0
5 rows × 727 columns

We can see here that we have a total of 727 unique genre classifications! Now, I will identify and present the top 20 most featured book genres.

In [ ]:
#Extract top 20 genres by genre frequency
top20_genres = genres_df.sum().sort_values(ascending=False)[:20]
#Visualize top 20 genres using bar chart
top20_genres.plot(kind='bar', color='#24799e', width=.8, linewidth=.8, edgecolor='k', rot=90)

Out[ ]: [bar chart of the top 20 genres by frequency]

Top 10 books on Goodreads

In [ ]:
import ast

#Assign appropriate data type to the rating distribution column
#(ast.literal_eval parses the dictionary literal without executing arbitrary code, unlike eval)
df['rating_distribution'] = df['rating_distribution'].apply(ast.literal_eval)
#get total number of five star ratings per book from the rating distribution column
df['total_5star_ratings'] = [int(dic['5'].replace(',','')) for dic in df['rating_distribution']]
#sort data by books with highest frequency of 5 star ratings
#(this line was cut off in the export after 'cover_image_uri; the closing ']] is assumed)
top10_books = df.sort_values(by='total_5star_ratings', ascending=False).iloc[:10][['book_title', 'author', 'genres', 'cover_image_uri']]
#report the results table
top10_books.iloc[:,:3]

Out[ ]:
   book_title                                author           genres
0  Harry Potter and the Sorcerer's Stone     J.K. Rowling     Fantasy, Fiction, Young Adult, Magic, Children...
1  The Hunger Games                          Suzanne Collins  Young Adult, Fiction, Fantasy, Science Fiction...
2  To Kill a Mockingbird                     Harper Lee       Classics, Fiction, Historical Fiction, School,...
3  Harry Potter and the Prisoner of Azkaban  J.K. Rowling     Fantasy, Fiction, Young Adult, Magic, Children...
4  Harry Potter and the Deathly Hallows      J.K. Rowling     Fantasy, Young Adult, Fiction, Magic, Children...
5  Harry Potter and the Goblet of Fire       J.K. Rowling     Fantasy, Young Adult, Fiction, Magic, Children...
6  The Fault in Our Stars                    John Green       Young Adult, Fiction, Contemporary, Realistic,...
7  Twilight                                  Stephenie Meyer  Fantasy, Young Adult, Romance, Fiction, Vampir...
8  Pride and Prejudice                       Jane Austen      Fiction, Historical Fiction, Historical, Liter...
9  Harry Potter and the Chamber of Secrets   J.K. Rowling     Fantasy, Fiction, Young Adult, Magic, Children...

In [ ]:
#get and display books by cover
get_covers(top10_books)

Distribution of rating scores

In [ ]:
#Aggregate ratings by rating star
rating_counts = {'5':0, '4':0, '3':0, '2':0, '1':0}
for ratings in df['rating_distribution']:
    for key, value in ratings.items():
        rating_counts[key] += int(value.replace(',',''))
#plot the ratings frequency distribution
plt.figure(figsize=(7,5))
plt.bar(rating_counts.keys(), rating_counts.values(), color='#24799e', width=.7, linewidth=.8, edgecolor='k')
plt.title('Frequency Distribution of Star Ratings', fontsize=11)
plt.xlabel('Star Rating', fontsize=10)
plt.ylabel('Frequency of Rating', fontsize=10)
plt.grid(axis='y', linestyle='-', alpha=.7)
plt.show()

Relationship between number of ratings and average rating score

In [ ]:
#Visualize the relationship between the number of ratings and the average rating
#score for a given book using scatter plot
plt.figure(figsize=(9,5))
sns.scatterplot(data=df, x='num_ratings', y='average_rating')
plt.gcf().axes[0].xaxis.get_major_formatter().set_scientific(False)
plt.xticks(rotation=-30)
plt.title('Relationship between Number of Ratings and Average Rating', fontsize=13)
plt.xlabel('Number of Ratings', fontsize=11.5)
plt.ylabel('Average Book Rating', fontsize=11.5)
plt.show()

As depicted by the above plot, there is a positive relationship between the number of ratings and the average
rating score of a given book. Users generally tend to give more ratings if they find the book favorable and deserving of a high rating score. Now that I have gathered an overview of the data, I will move on to an important feature engineering step to prepare the data for modeling and text processing. In particular, I will create a new column, 'combined_features', that combines all the important book features together, which is crucial for the text vectorization and processing that follow.

Part Four: Feature Engineering (Combining Features)

In [ ]:
#Combine features for overall text processing
df['combined_features'] = (df['book_title'] + ' / ' + df['author'] + ' / ' + df['publication_year'] + ' / ' + df['genres'] + ' / ' +
#Preview a sample of the combined features column
for row in df['combined_features'].sample(5):
    print(row[:200],'\n')

Inkspell / Cornelia Funke / 2005 / Fantasy, Young Adult, Fiction, Middle Grade, Childrens, Adventure, Magic / 3.94 / The captivating sequel to INKHEART, the critically acclaimed, international bestsel

The Stone Raft / José Saramago / 1986 / Fiction, Portugal, Magical Realism, Literature, Portuguese Literature, Nobel Prize, Novels / 3.82 / When the Iberian Peninsula breaks free of Europe and begins

The Gods of Mars / Edgar Rice Burroughs / 1913 / Science Fiction, Fantasy, Fiction, Classics, Adventure, Pulp, Science Fiction Fantasy / 3.88 / After the long exile on Earth, John Carter finally retur

The Second Korean War / Ted Halstead / 2018 / Fiction, War, Military / 4.15 / "This book was like Tom Clancy reincarnated. Ted Halstead really knows how to write a thriller. Can't wait for more!"Two R

All Dreamers Go to America / Ana Ingham / 2009 / Drama, Novels, Fiction, Contemporary / 4.23 / Ana Ingham's delightful novel, All Dreamers Go to America, is a wonderful tale about a young man who has

Part Five: Text Vectorization and Processing

In this section, I will employ TF-IDF to perform text vectorization, weighting the importance of each term within the description of a single book relative to the descriptions of all other books. The TF-IDF vectorizer will also be given several settings to process text better: standardizing terms, filtering out the most common words, dropping terms that appear in only one book (often meaningless terms or typos), and so on. The vectorizer then returns a matrix with the term weights per book, which is used for further analysis. In particular, the next step is to measure and quantify the similarities between books based on this matrix. To do so, I will use cosine similarity to measure term similarities across the different books in the data. A cosine similarity of 1 indicates identity, or, in this context, maximum overall similarity, whilst a cosine similarity of 0 indicates zero commonality. The resulting similarity matrix represents the overall similarities between the different books in the dataset. This covers all the important features of a book, including the author, book genres, and plot summary or description, as captured by the 'combined_features' column defined above.

Finally, a separate similarity matrix will be created for genre alone. To identify genre similarities, I will employ Jaccard similarity (one minus the Jaccard distance), which quantifies how many genre labels two books share relative to all the genre labels they carry between them. This enables us to improve and better tailor book recommendations by leveraging genre similarity along with overall similarity.
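To make the two similarity measures concrete, here is a minimal sketch on toy data. It is not the pipeline used in this notebook: it uses raw term counts instead of TF-IDF weights and plain Python instead of scikit-learn and SciPy, and the helper names `cosine_similarity_counts` and `jaccard_similarity` are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def cosine_similarity_counts(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts, using raw term counts (not TF-IDF)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(genres_a: str, genres_b: str) -> float:
    """Jaccard similarity between two comma-separated genre strings."""
    set_a = set(genres_a.split(', '))
    set_b = set(genres_b.split(', '))
    union = set_a | set_b
    if not union:
        return 0.0
    # Shared labels divided by all labels carried by either book
    return len(set_a & set_b) / len(union)


# Two books sharing 2 of 4 distinct genre labels
print(jaccard_similarity('Fantasy, Fiction, Magic', 'Fantasy, Fiction, Romance'))  # 0.5
# Texts with no terms in common have a cosine similarity of 0
print(cosine_similarity_counts('war history', 'romance novel'))  # 0.0
```

Both measures range from 0 to 1, which is what allows them to be blended later with a single weighting factor.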
In [ ]:
#Define custom tokenizer to process text better
def my_tokenizer(text):
    #Lowercase the text and keep only word tokens (drops punctuation and whitespace)
    tokens = re.findall(r'\b\w+\b', text.lower().strip())
    return tokens

Identifying overall similarity: Text vectorization with TF-IDF

In [ ]:
#Create TF-IDF object and set text vectorization characteristics
tfidf_vectorizer = TfidfVectorizer(stop_words='english',    #remove common english words (e.g., the, then)
                                   tokenizer=my_tokenizer,  #specify text tokenizer (to process and standardize terms)
                                   ngram_range=(1,2),       #use single words and two-word phrases as terms
                                   min_df=2)                #drop terms that appear in fewer than 2 books
#fit and transform the data to get a TF-IDF matrix
tfidf_mtrx = tfidf_vectorizer.fit_transform(df['combined_features'])

#calculate cosine similarity to obtain the overall similarity matrix
similarity_mtrx = cosine_similarity(tfidf_mtrx, tfidf_mtrx)

Identifying genre similarity

Now I will create a similarity matrix for genre alone (using Jaccard similarity). First, I will convert my genres dataframe into a boolean array, and then compute the pairwise Jaccard distances and subtract them from 1 to obtain a similarity matrix for genre alone.

In [ ]:
#Convert genres_df to a boolean array (it passes through a CSR matrix, but pdist needs a dense array, hence toarray())
genres_csr_mtrx = csr_matrix(genres_df.values).astype(bool).toarray()
#Compute pairwise jaccard distances and convert them to a jaccard similarity matrix
#(the metric is given by name, so no separate import of a jaccard function is needed)
genre_sim_mtrx = 1 - squareform(pdist(genres_csr_mtrx, metric='jaccard'))
#normalize jaccard similarity scores
genre_sim_mtrx = genre_sim_mtrx / np.max(genre_sim_mtrx) if np.max(genre_sim_mtrx) > 0 else genre_sim_mtrx

Now with all the data processed and analyzed thoroughly, I will build the main function for tailoring and delivering book recommendations.

Part Six: Building a Book Recommendation Function

In this section, I will develop a custom function for delivering personalized book recommendations. This function will constitute the heart of the book recommendation system.
It will take a book title as input and return the most relevant book recommendations for that book, balancing the two similarity matrices obtained above, that is, overall similarity and genre similarity. It is also supplied with a parameter, alpha, which specifies the exact balance between the two matrices: alpha = 1 relies on overall similarity alone, alpha = 0 relies on genre similarity alone, and values in between mix the two. It also features another parameter, N, which specifies the number of book recommendations to return. The output is a data table with the recommendation results, followed by a display of each book by its cover in sequential order. You can read the function's documentation for more details.

In [ ]:
#Define helper functions to return book recommendations
def Get_Recommendations(title: str, sim_mtrx: np.ndarray, genre_sim_mtrx: np.ndarray, alpha=0.5, N=10):
    """
    This function takes a book title and recommends similar books that cover similar themes or fall within the same genre categories.

    Parameters:
    - title (str): The title of the book for which recommendations are sought.
    - sim_mtrx (ndarray): A similarity matrix based on book overall similarities, where each row corresponds to a book and each column corresponds to its cosine similarity score with other books.
    - genre_sim_mtrx (ndarray): A similarity matrix based on book genres, where each row corresponds to a book and each column corresponds to its jaccard similarity score with other books based on genre.
    - alpha (float, optional): Weighting factor for combining overall similarity and genre similarity. Defaults to 0.5, weighting overall similarity and genre similarity equally.
    - N (int, optional): Number of recommendations to return. Defaults to 10.

    Returns:
    - None. Displays a data table with the recommended books and a plot of each book with its cover. Returns False if the title is not found.

    Raises:
    - TypeError: If the title provided cannot be converted to a string.

    Notes:
    - This function filters, preprocesses and standardizes the book title given and identifies its genre categories, importantly, identifying whether it's a Fiction or Nonfiction work to prevent genre overlap while looking for recommendations.
    - It looks for book recommendations by combining similarity scores from two matrices: sim_mtrx (based on overall similarities) and genre_sim_mtrx (based on genres).
    - It prioritizes books with similar genre categories; otherwise, it recommends books based on overall book similarity.
    - Finally, recommendations are filtered to include books by a variety of authors, limiting the number of recommendations to only 5 books per author.
    - The number of book recommendations can be adjusted using the 'N' parameter. Default is 10 book recommendations.
    """
    #check if title provided is of the correct data type (string)
    try:
        curr_title = str(title)
    except Exception:
        raise TypeError('Book title entered is not string.')

    #standardize titles for accurate comparisons
    title = curr_title.lower().strip()
    full_titles = df['book_title'].apply(lambda title: title.lower().strip())
    partial_titles = full_titles.str.extract(r'^(.*?):')[0].dropna()

    #check if provided title matches book title in the dataset and get index if found
    if title in full_titles.values:
        indx = df[full_titles == title].index[0]
    elif title in set(partial_titles.values):
        indx_partial = partial_titles[partial_titles == title].index[0]
        indx = df[df['book_title'] == df['book_title'].iloc[indx_partial]].index[0]
    else:
        #otherwise normalize all titles (remove punctuation and a leading 'the' or 'a') and try matching again
        def normalize(text):
            return re.sub(r'(^\s*(the|a)\s+|[^\w\s])', '', text, flags=re.IGNORECASE)

        normalized_title = normalize(title)
        normalized_full_titles = full_titles.apply(normalize)
        normalized_partial_titles = partial_titles.apply(normalize)
        if normalized_title in normalized_full_titles.values:
            indx = df[normalized_full_titles == normalized_title].index[0]
        elif normalized_title in set(normalized_partial_titles.values):
            indx_partial = normalized_partial_titles[normalized_partial_titles==normalized_title].index[0]
            indx = df[df['book_title'] == df['book_title'].iloc[indx_partial]].index[0]
        else:
            print(f'\nBook with title \'{curr_title}\' is not found. Please try a different book.\n', flush=True)
            return False

    #Check if 'Fiction' is in the genre of the selected book
    is_fiction = 'Fiction' in df['genres'].iloc[indx]
    #Find books with the same genre category
    if is_fiction:
        book_indices_ByGenre = [i for i in df.index if ('Fiction' in df['genres'].iloc[i]) and (i != indx)]
    else:
        #(this line was cut off in the export; the closing `and (i != indx)` is assumed to mirror the Fiction branch)
        book_indices_ByGenre = [i for i in df.index if ('Fiction' not in df['genres'].iloc[i] or 'Nonfiction' in df['genres'].iloc[i]) and (i != indx)]

    #Combine the two similarity matrices using weighted sum
    weighed_similarity = (alpha * sim_mtrx[indx]) + ((1 - alpha) * genre_sim_mtrx[indx])
    #Get combined similarity scores for books with the same genre category
    similarity_scores = [(i, weighed_similarity[i]) for i in book_indices_ByGenre]
    #Sort the books based on the combined similarity scores
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)

    #If less than N books are found in the same genre category, add books by closest overall cosine similarity
    if len(similarity_scores) < N:
        #weighed_similarity is already a single row, so indexing it with indx would give one number;
        #the row of overall similarities is used here instead, in line with the comment above
        cos_scores = list(enumerate(sim_mtrx[indx]))
        cos_scores = sorted(cos_scores, key=lambda x: x[1], reverse=True)
        #exclude the input book and books already selected
        selected_indices = {x[0] for x in similarity_scores}
        cos_scores = [score for score in cos_scores if score[0] != indx and score[0] not in selected_indices]
        similarity_scores += cos_scores[:N - len(similarity_scores)]

    #Limit recommendations to 5 books per author
    author_counts = {}
    similarity_scores_filtered = []
    for score in similarity_scores:
        author = df['author'].iloc[score[0]]
        if author not in author_counts or author_counts[author] < 5:
            similarity_scores_filtered.append(score)
            author_counts[author] = author_counts.get(author, 0) + 1

    #Get the scores of the N most similar books
    most_similar_books = similarity_scores_filtered[:N]
    #Get the indices of the books selected
    most_similar_books_indices = [i[0] for i in most_similar_books]
    #Prepare DataFrame with recommended books and their details
    recommended_books = df.iloc[most_similar_books_indices][['book_title', 'author', 'cover_image_uri']]
    recommended_books['Recommendation'] = recommended_books.apply(lambda row: f"{row['book_title']} (by {row['author']})", axis=1)
    recommended_books.reset_index(drop=True, inplace=True)

    #Return book recommendations
    print(f"\nRecommendations for '{curr_title.title()}' (by {df['author'].iloc[indx]}):", flush=True)
    display(recommended_books['Recommendation'].to_frame().rename(lambda x:x+1))
    print('\n', flush=True)
    get_covers(recommended_books)
    return

Part Seven: Testing the Recommendation System

Finally, in this last section I will test out the recommendation system. This will unfold in three steps. First, I will test the recommender by passing a popular book title to the function, such as Shakespeare's Macbeth, and execute it to obtain the relevant book recommendations. Second, I will obtain a sample of books picked at random from the dataset and pass them to the function to obtain and evaluate the recommendations for each. And finally, I will develop a function that takes book titles from the user as input and returns the relevant book recommendations when available.
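Before running these tests, it may help to see the ranking idea behind the recommendation function in isolation. The sketch below is a simplified illustration on made-up numbers, not the function itself: `rank_recommendations` and its `max_per_author` and `exclude` arguments are hypothetical names, and the Fiction/Nonfiction filtering and the title matching are left out. It blends two similarity rows with alpha, sorts by the blended score, and caps how many books one author can contribute.

```python
def rank_recommendations(content_sims, genre_sims, authors,
                         alpha=0.5, n=3, max_per_author=2, exclude=None):
    """Return indices of the top-n books by blended similarity, capped per author."""
    # Weighted sum: alpha weights overall similarity, (1 - alpha) weights genre similarity
    blended = [alpha * c + (1 - alpha) * g for c, g in zip(content_sims, genre_sims)]
    # Candidate indices sorted from most to least similar
    order = sorted(range(len(blended)), key=lambda i: blended[i], reverse=True)

    picked, per_author = [], {}
    for i in order:
        if i == exclude:
            continue  # skip the input book itself
        if per_author.get(authors[i], 0) >= max_per_author:
            continue  # this author already reached the cap
        picked.append(i)
        per_author[authors[i]] = per_author.get(authors[i], 0) + 1
        if len(picked) == n:
            break
    return picked


# Toy data: book 0 is the input book; books 1 and 2 share its author
content = [1.0, 0.9, 0.8, 0.3, 0.5]
genre = [1.0, 0.8, 0.9, 0.6, 0.1]
authors = ['A', 'A', 'A', 'B', 'C']

print(rank_recommendations(content, genre, authors, alpha=0.7, exclude=0))
# [1, 2, 3]
print(rank_recommendations(content, genre, authors, alpha=0.7, max_per_author=1, exclude=0))
# [1, 3, 4]
```

With the cap set to one book per author, the second book by author 'A' is skipped and the list fills up with other authors, which is the same mechanism that limits the real function to 5 books per author.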
In [ ]:
#Adjust pandas display settings to display entire column
pd.set_option('display.max_colwidth', None)

Generating Book Recommendation for Famous Title

In [ ]:
#Get 10 book recommendations for 'Macbeth' (by Shakespeare)
book_title = 'Macbeth'
Get_Recommendations(book_title, similarity_mtrx, genre_sim_mtrx, alpha=0.7, N=10)

Recommendations for 'Macbeth' (by William Shakespeare):

    Recommendation
1   Hamlet (by William Shakespeare)
2   Othello (by William Shakespeare)
3   King Lear (by William Shakespeare)
4   Romeo and Juliet (by William Shakespeare)
5   The Merchant of Venice (by William Shakespeare)
6   Hamlet: Screenplay, Introduction And Film Diary (by Kenneth Branagh)
7   Doubt, a Parable (by John Patrick Shanley)
8   The Oedipus Cycle: Oedipus Rex, Oedipus at Colonus, Antigone (by Sophocles)
9   Oedipus Rex (by Sophocles)
10  Antigone (by Sophocles)

Generating Book Recommendations from Random Titles

In [ ]:
#Get recommendations for titles chosen at random
random_titles = df.sample(5)[['book_title','author']]
#get recommendations for the selected titles
for title,author in zip(random_titles.iloc[:,0],random_titles.iloc[:,1]):
    Get_Recommendations(title, similarity_mtrx, genre_sim_mtrx, alpha=0.7, N=10)
    print('\n', 150*'_' + '\n')

Recommendations for 'Persuader' (by Lee Child):

    Recommendation
1   One Shot (by Lee Child)
2   Nothing to Lose (by Lee Child)
3   Bad Luck and Trouble (by Lee Child)
4   Worth Dying For (by Lee Child)
5   Tripwire (by Lee Child)
6   Dragan Radelscu & The Vampires Of Paris (by Shamus Sherwood)
7   Face of a Killer (by Robin Burcell)
8   The Nowhere Man (by Gregg Andrew Hurwitz)
9   The Lion's Game (by Nelson DeMille)
10  Saving Faith (by David Baldacci)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'Silent Night 2' (by R.L. Stine):

    Recommendation
1   Silent Night (by R.L. Stine)
2   All-Night Party (by R.L. Stine)
3   The Face (by R.L. Stine)
4   The Secret Bedroom (by R.L. Stine)
5   Night of the Living Dummy (by R.L. Stine)
6   After the First Death (by Robert Cormier)
7   Greenglass House (by Kate Milford)
8   Frozen Charlotte (by Alex Bell)
9   172 Hours on the Moon (by Johan Harstad)
10  Full Tilt (by Neal Shusterman)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'The Postman' (by David Brin):

    Recommendation
1   This Is The Way The World Ends (by James K. Morrow)
2   The Gone-Away World (by Nick Harkaway)
3   Earth Abides (by George R. Stewart)
4   Alas, Babylon (by Pat Frank)
5   Farnham's Freehold (by Robert A. Heinlein)
6   Swan Song (by Robert McCammon)
7   The White Plague (by Frank Herbert)
8   Wool (by Hugh Howey)
9   Go-Go Girls of the Apocalypse (by Victor Gischler)
10  Dies the Fire (by S.M. Stirling)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'The Black Circle' (by Patrick Carman):

    Recommendation
1   Beyond the Grave (by Jude Watson)
2   One False Note (by Gordon Korman)
3   The Viper's Nest (by Peter Lerangis)
4   Into the Gauntlet (by Margaret Peterson Haddix)
5   The Sword Thief (by Peter Lerangis)
6   In Too Deep (by Jude Watson)
7   The Emperor's Code (by Gordon Korman)
8   Storm Warning (by Linda Sue Park)
9   The Maze of Bones (by Rick Riordan)
10  The Ersatz Elevator (by Lemony Snicket)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'Chase The Dark' (by Annette Marie):

    Recommendation
1   Demons at Deadnight (by A. Kirk)
2   Stone Cold Touch (by Jennifer L. Armentrout)
3   Immortal Beloved (by Cate Tiernan)
4   Dead Man Rising (by Lilith Saintcrow)
5   The Gathering Darkness (by Lisa Collicutt)
6   If I Die (by Rachel Vincent)
7   White Hot Kiss (by Jennifer L. Armentrout)
8   Carrier of the Mark (by Leigh Fallon)
9   Every Last Breath (by Jennifer L. Armentrout)
10  The Indigo Spell (by Richelle Mead)
______________________________________________________________________________________________________________________________________________________

Generating Book Recommendations from User Input

In [ ]:
#Defining custom function that requests a book title from the user and returns relevant book recommendations
def Get_Recommendations_fromUser():
    while True:
        book_title = input('\nEnter book title: ')
        recommendations = Get_Recommendations(book_title, similarity_mtrx, genre_sim_mtrx, alpha=0.7, N=10)
        print('\n', 150*'_' + '\n', flush=True)
        if recommendations is not False:
            response = str(input('\n\nWould you like to get recommendations for more books? [Yes/no]\n')).lower().strip()
            if response in ['yes', 'y']:
                continue
            elif response in ['no', 'n']:
                print('\nThank you for trying the recommender.\nExiting...')
                break
            else:
                print('\nResponse invalid.\nProcess terminating...')
                break

In [ ]:
#Execute the user recommender function
Get_Recommendations_fromUser() # The Great Gatsby; Return of the king; Atomic Habit; Atomic Habits; a brief history of time; Critiqu

Recommendations for 'The Great Gatsby' (by F. Scott Fitzgerald):

    Recommendation
1   F. Scott Fitzgerald: The Great Gatsby (by Nicolas Tredell)
2   This Side of Paradise (by F. Scott Fitzgerald)
3   The Beautiful and Damned (by F. Scott Fitzgerald)
4   The Love of the Last Tycoon (by F. Scott Fitzgerald)
5   Great Expectations (by Charles Dickens)
6   The Pearl (by John Steinbeck)
7   Ethan Frome (by Edith Wharton)
8   The Portrait of a Lady (by Henry James)
9   Old School (by Tobias Wolff)
10  A Tale of Two Cities (by Charles Dickens)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'Return Of The King' (by J.R.R. Tolkien):

    Recommendation
1   The Two Towers (by J.R.R. Tolkien)
2   J.R.R. Tolkien 4-Book Boxed Set: The Hobbit and The Lord of the Rings (by J.R.R. Tolkien)
3   The Fellowship of the Ring (by J.R.R. Tolkien)
4   The Lord of the Rings (by J.R.R. Tolkien)
5   Orcs (by Stan Nicholls)
6   The Children of Húrin (by J.R.R. Tolkien)
7   New Spring (by Robert Jordan)
8   The Great Hunt (by Robert Jordan)
9   The Shadow Rising (by Robert Jordan)
10  The Eye of the World (by Robert Jordan)
______________________________________________________________________________________________________________________________________________________

Book with title 'Atomic Habit' is not found. Please try a different book.
______________________________________________________________________________________________________________________________________________________

Recommendations for 'Atomic Habits' (by James Clear):

    Recommendation
1   The Power of Habit: Why We Do What We Do in Life and Business (by Charles Duhigg)
2   Better Than Before: Mastering the Habits of Our Everyday Lives (by Gretchen Rubin)
3   Deep Work: Rules for Focused Success in a Distracted World (by Cal Newport)
4   Eat That Frog! 21 Great Ways to Stop Procrastinating and Get More Done in Less Time (by Brian Tracy)
5   13 Things Mentally Strong People Don't Do: Take Back Your Power, Embrace Change, Face Your Fears, and Train Your Brain for Happiness and Success (by Amy Morin)
6   How Women Rise: Break the 12 Habits Holding You Back from Your Next Raise, Promotion, or Job (by Sally Helgesen)
7   The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change (by Stephen R. Covey)
8   Getting Things Done: The Art of Stress-Free Productivity (by David Allen)
9   You Are a Badass: How to Stop Doubting Your Greatness and Start Living an Awesome Life (by Jen Sincero)
10  Building a Second Brain: A Proven Method to Organize Your Digital Life and Unlock Your Creative Potential (by Tiago Forte)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'A Brief History Of Time' (by Stephen Hawking):

    Recommendation
1   A Briefer History of Time (by Stephen Hawking)
2   The Universe in a Nutshell (by Stephen Hawking)
3   Black Holes & Time Warps: Einstein's Outrageous Legacy (by Kip S. Thorne)
4   The Grand Design (by Stephen Hawking)
5   Wrinkles in Time (by George Smoot)
6   Parallel Worlds: A Journey through Creation, Higher Dimensions, and the Future of the Cosmos (by Michio Kaku)
7   The Principia : Mathematical Principles of Natural Philosophy (by Isaac Newton)
8   The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory (by Brian Greene)
9   Billions & Billions: Thoughts on Life and Death at the Brink of the Millennium (by Carl Sagan)
10  The Fabric of the Cosmos: Space, Time, and the Texture of Reality (by Brian Greene)
______________________________________________________________________________________________________________________________________________________

Recommendations for 'Critique Of Pure Reason' (by Immanuel Kant):

    Recommendation
1   Groundwork of the Metaphysics of Morals (by Immanuel Kant)
2   Phenomenology of Spirit (by Georg Wilhelm Friedrich Hegel)
3   Being and Time (by Martin Heidegger)
4   Individuals: An Essay in Descriptive Metaphysics (by Peter Frederick Strawson)
5   An Enquiry Concerning Human Understanding (by David Hume)
6   100 Questions About God (by J. Edwin Orr)
7   Our Knowledge of the External World (by Bertrand Russell)
8   Difference and Repetition (by Gilles Deleuze)
9   Beyond Good and Evil: Prelude to a Philosophy of the Future (by Friedrich Nietzsche)
10  On the Genealogy of Morals / Ecce Homo (by Friedrich Nietzsche)
______________________________________________________________________________________________________________________________________________________

Thank you for trying the recommender.
Exiting...
