So to start it all, I learned about this question from the recommended youtube channel Clément Mihailescu in his youtube video
In this video both Clément and Tim from Tech with Tim work together to solve this question:
Suppose you have two People that want to Schedule a meeting, how would you schedule it, if each Person has already set up meetings in this day and that each Person has different working hours
To help us understand it the problem provided us with this example data
Person 1 has three meetings 9:00 to 10:30, 12:00 to 13:00 and 16:00 to 18:00, this Person works from 9:00 to 20:00
Person 2 also has three meetings 10:00 to 11:30, 12:30 to 14:30 and 18:00 to 18:30 this Person works from 10:00 to 18:30
For those just starting out with R and the tidyverse I will explain each part in detail in the collapsible parts of the post.
So coding this info in R we have
library(tidyverse)
person1 <- list(list('9:00', '10:30'),
list('12:00', '13:00'),
list('16:00', '18:00'))
allowed_time1 <- list(list('9:00','20:00'))
person2 <- list(list('10:00', '11:30'),
list('12:30', '14:30'),
list('18:00', '18:30'))
allowed_time2 <- list(list('10:00','18:30'))
If you see the video Tim was able to solve it in 45 minutes while having spent a good amount of the time talking back and forth with his interviewer and explaining his reasoning, that was impressive.
Well I failed to finish this problem in 45 minutes, in fact it took close to 3 hours to stitch together the solution I am about to show, but I think my solution is something that I am proud of and that it follows most of what I love about functional programming.
Turning strings into numbers
I saw that Tim created a function to compare time in his program
Creating a function that calculates the amount of minutes in a string such as ‘10:30’
calculate_minutes <- function(x) {
x <- str_split(x,pattern = ":")
as.integer(x[[1]][[1]]) * 60 + as.integer(x[[1]][[2]])
}
calculate_minutes('10:30')
## [1] 630
explanation on calculate minutes
calculate_minutes splits the string ‘10:30’ into ‘10’ and ‘30’ and then multiples the left hand side by 60, because each hour has 60 minutes in it and then adds the right hand side the minutes to the result 630 = 10 * 60 + 30 minutes.
Reframing the problem
While at first the calculate_minutes may seem useless in both r and python since ‘14:30’ < “10:30” will return FALSE/False
We can reframe the focus on minutes overlapping of each appointment instead of time comparison like Tim used is his solution.
So based on this new idea, I created a function to convert the the strings we received into ranges of minutes, basically the interval of each appointment.
We will also need the interval of the full Day
full_day <- 1:(24*60) # 24 hours * 60 minutes -> range 1: to result
Here is the function that applies Step One and Two
calculate_interval <- . %>%
map(calculate_minutes) %>%
reduce(seq)
example_appointment <- list('10:30','10:40')
example_appointment %>% calculate_interval
## [1] 630 631 632 633 634 635 636 637 638 639 640
explanation on calculate_interval
I use “.” as a shortcut for function(x) {} it is really useful in function pipes like this one
map is a function that applies another function to all elements of an list and returns ideally the same number of elements like this,
example_list <- list(list('hi','johnny'),list('how','are','you'))
map(example_list,.f = str_to_upper)
## [[1]]
## [1] "HI" "JOHNNY"
##
## [[2]]
## [1] "HOW" "ARE" "YOU"
I also use pipes (%>%), the pipes allow us to change the nested nature of function calls into a sequential one for example
example_list %>%
map(str_to_title)
## [[1]]
## [1] "Hi" "Johnny"
##
## [[2]]
## [1] "How" "Are" "You"
And finally I use another core function of functional programming reduce, reduce works by applying the same function in a list until the is only one element left for example
reduce(1:4,sum)
## [1] 10
We then need to use this function on all of our info
person1_interval <- person1 %>%
map(calculate_interval)
person2_interval <- person2 %>%
map(calculate_interval)
allowed_time1_interval <- allowed_time1 %>%
map(calculate_interval)
allowed_time2_interval <- allowed_time2 %>%
map(calculate_interval)
example Person1
person1 %>%
map(calculate_interval)
## [[1]]
## [1] 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558
## [20] 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577
## [39] 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596
## [58] 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615
## [77] 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630
##
## [[2]]
## [1] 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738
## [20] 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757
## [39] 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776
## [58] 777 778 779 780
##
## [[3]]
## [1] 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974
## [16] 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989
## [31] 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004
## [46] 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019
## [61] 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034
## [76] 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049
## [91] 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064
## [106] 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079
## [121] 1080
Calculating occupied and available minutes
This is a simple step we collapse the lists into one
schedule_1_occupied <- person1_interval <- person1_interval %>% reduce(c)
schedule_1_avalaible <- allowed_time1_interval %>% reduce(c)
schedule_2_occupied <- person2_interval %>% reduce(c)
schedule_2_avalaible <- allowed_time2_interval %>% reduce(c)
example reduce c
Here is an example of how to collapse a simple list
list(c(1,3,4),c(2,5)) %>% reduce(c)
## [1] 1 3 4 2 5
Filtering lists
Now finally to our last core functional language function (filter), in the tidyverse filter was split into two functions discard and keep, mostly because the actual filter function is used in dplyr
We can start with the full_day and take away occupied minutes and keep valid minutes, all using filter.
possible_minutes <- full_day %>%
discard(~ .x %in% c(schedule_1_occupied,schedule_2_occupied)) %>%
keep(~ .x %in% schedule_1_avalaible) %>%
keep(~ .x %in% schedule_2_avalaible)
possible_minutes
## [1] 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708
## [19] 709 710 711 712 713 714 715 716 717 718 719 871 872 873 874 875 876 877
## [37] 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895
## [55] 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913
## [73] 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931
## [91] 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949
## [109] 950 951 952 953 954 955 956 957 958 959
explanation filter
filter works by receiving an vector or list and return only elements that have passed or failed a test for example
discard(1:10, ~ . > 5)
## [1] 1 2 3 4 5
keep(1:10, ~ . > 5)
## [1] 6 7 8 9 10
Keep in mind that while discarding we can evaluate everything together and the result stays the same, but when keeping it is important in this case to separate the call into two
Calculating start and end minutes
This function solves the problem that we as human prefer to receive just the start and end minutes instead of the whole duration
calculate_break_time <- function(intergers){
intergers <- sort(intergers)
end <- time <- which(diff(intergers) != 1)
begin <- end + 1
intergers[c(1,end,begin,length(intergers)) %>% sort()]
}
explanation calculate_break_time
This is the function that I am least happy, but basically the two minutes that we know we will need are the smallest minute that will start the first appointment and the last minute which will end the last appointment
The in between minutes are found by looking for jumps between minutes using the diff function
diff(c(1,3,4,5,9))
## [1] 2 1 1 4
If there is more than one minute of difference it is the end of an appointment and one minute later there will be the start of a new appointment
start_end_minutes <- possible_minutes %>% calculate_break_time()
start_end_minutes
## [1] 691 719 871 959
Converting minutes back into readable hours
A simple function that undoes our transformation if you need to read %/% explanation
turn_back_into_time <- function(x) {
x_hour <- x %/% 60
x_minute <- x %% 60
str_c(x_hour,x_minute,sep = ':')
}
(readable_time <- start_end_minutes %>% turn_back_into_time)
## [1] "11:31" "11:59" "14:31" "15:59"
Pairing Start and End Hours
I will over complicate the flagging of even and odd numbers to show another really cool functional concept of currying, while currying is not encouraged by the tidyverse the partial functions does make it pretty easy to use (same as Python)
The other cool concept is that you can negate a function, in this case turning the results of is_even into is_odd
diviseble_by <- function(number_vector,divisor,quocient) {
number_vector %% divisor == quocient
}
is_even <- partial(diviseble_by,divisor = 2,quocient = 0)
is_odd <- is_even %>% negate()
1:10 %>% is_even
## [1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
1:10 %>% is_odd
## [1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE
We then use these functions to unite our start and end minutes
pair_wise_combination <- function(character_vector){
vector_i <- seq_along(character_vector)
ends <- vector_i %>% is_even
begins <- vector_i %>% is_odd
str_c(character_vector[begins],character_vector[ends],sep = " ")
}
readable_time %>% pair_wise_combination
## [1] "11:31 11:59" "14:31 15:59"
That is all folks.
My answer also works for n people
I can answer questions anywhere, please do share it if you have enjoyed it.
Big O problem and Data Science
I also failed because, I would have blankly stared into the interviewer face for a while before admitting that I have no idea the Big O of this answer is
While I do understand big O notation and its importance, I am not a Software Engineer nor a Computer Scientist, I have no idea how efficient my solution is, I know it is fast enough for me, but I understand that big O knowledge is a major difference while learning DS compared to the usual programming paths, rarely if ever people mention the Big O of our algorithms so I never deeply studied about the subject.